Automate Your Browser With Selenium

Automate your Browser with Python
Frank Anemaet
This book is for sale at http://leanpub.com/automateyourbrowserwithpython
This version was published on 2020-08-16
© 2020 Frank Anemaet

Contents
Introduction
Chapter 1
    Setup
    Install on Ubuntu Linux
    Install on Mac OS X
    Install on Windows
Chapter 2
    Basics
    Selenium browser
    Navigate
    Take screenshot automatically
    Click button
    HTML Text input
    Get HTML element text
    Scroll page
    Test if webpage contains text
Chapter 3
    Locate elements by XPath
    Get all links from webpage
Chapter 4
    Get Links from Webpage
    Radiobuttons
    Dropdown
    Checkbox
    Table
    css selector
    Wait for page load
Chapter 5
    Maximize window
    Minimize window
    Hide window
    Window size and position
    Tabs
    Private mode
Introduction
Welcome to the Python selenium course. In this book you will learn how to automate the web
browser with Python.
You can automate any web app or website. This means you can do automated testing, automate
boring or repetitive tasks, do web scraping and a variety of other things.
How does it work?
In essence, a Python script controls a web browser such as Firefox or Chrome. The script
instructs the browser what to do: opening new pages, navigating, collecting data,
clicking, or whatever the developer wants the browser to do.
Selenium is the intermediary, making it possible to control web browsers with Python. More details
on exactly how it works are discussed in this book.
Supported browsers are:
• Firefox
• Internet Explorer (version 11)
• Safari
• Opera
• Chrome
• Edge
You can run Python selenium scripts on Microsoft Windows, Apple macOS and Linux.
This is an intermediate course. You should already know the basics of Python. After reading this
book, you can control these web browsers with Python and you’ll be able to automate any web
app or website, and do web scraping, web testing and a lot more!
Chapter 1
First let’s talk about the basic setup for Python web automation. You start with Python code.
The code uses a web driver, and depending on which browser you want to automate, you need
a different driver.
For Chrome there’s chromedriver, for Firefox there’s geckodriver, and so on.
A web driver controls a web browser. Depending on the driver, it controls Firefox, Chrome, Opera,
etc.
So you are not controlling the web browser directly; you control it via a web driver.
Control browser via WebDriver

Setup
You need to install the selenium module and a web driver.
The first thing to do is to install the selenium module for Python. You can do that with the
command pip install selenium or pip3 install selenium, depending on your operating system.
Most of the time it’s pip install selenium.
pip install selenium
You may want to do this in a virtual environment (virtualenv).

But this is not the only step.
You also need to install a web driver.
For Google Chrome and Chromium, chromedriver is available at https://chromedriver.chromium.org/
For Mozilla Firefox, you can fetch geckodriver from https://github.com/mozilla/geckodriver/releases/tag/v0.26.0
(scroll to bottom).
In the next sections, we’ll discuss an easy way to install selenium for the most popular operating
systems.
Install on Ubuntu Linux

First make sure you have the pip package manager installed; on Ubuntu this may be called pip3.
If it’s not installed, you can use apt-get to install it:
sudo apt-get -y install python3-pip
Then install the selenium module:
pip3 install selenium
You must install a web driver too.
Firefox
Download and install the Firefox browser. The next thing to do is to install the driver. You
can do that manually by downloading the Firefox driver from https://github.com/mozilla/geckodriver/releases/tag/v
(change to your version). Then extract it, copy it to /usr/local/bin and make it executable using the
command chmod +x.
Alternatively, copy the contents of the script below and save it as firefox.sh
echo "Linux script, for Mac OS install geckodriver manually."
echo ""
echo "Checking latest version..."
# Download the "latest release" page and read the version from its <title>
wget -q https://github.com/mozilla/geckodriver/releases/latest
version=$(cat latest | sed -n -e 's!.*<title>Release \(.*\)</title>.*!\1!p')
# Keep only the first word of the title (the version number)
version=$(echo $version | sed 's/ .*//')
echo $version

echo "Installing geckodriver..."
wget https://github.com/mozilla/geckodriver/releases/download/v$version/geckodriver-v$version-linux64.tar.gz
tar -xvzf geckodriver-v$version-linux64.tar.gz
rm geckodriver-v$version-linux64.tar.gz
chmod +x geckodriver
cp geckodriver /usr/local/bin/
Make the script executable, then run it to install the Firefox driver. The script copies
geckodriver to /usr/local/bin, so it needs root privileges:
chmod 755 firefox.sh
sudo ./firefox.sh
Chrome
If you want to use Chrome, install the Chrome web browser and the driver. You can download the
driver from https://chromedriver.storage.googleapis.com/
Then extract it and install it.
Alternatively, use the Chrome install script below; save it as chrome.sh:
LATEST=$(wget -q -O - https://chromedriver.storage.googleapis.com/LATEST_RELEASE)
wget https://chromedriver.storage.googleapis.com/$LATEST/chromedriver_linux64.zip
unzip chromedriver_linux64.zip && sudo ln -s $PWD/chromedriver /usr/local/bin/chromedriver
Then run these commands:
chmod 755 chrome.sh
./chrome.sh
If these commands ran successfully, the selenium module and the web driver are installed.
The scripts test-firefox.py and test-chrome let you test whether the installation was successful.
Installation output on Ubuntu Linux

Install on Mac OS X
If you use Apple Mac OS X, you can install selenium and the web driver too. The process is similar
to installation on Linux.
First open a terminal.
Firefox
Create this script and save it as firefox.sh
echo "Web Driver install for Mac OS X (Firefox, Geckodriver)."
echo ""
echo "Checking latest version..."
# Download the "latest release" page and read the version from its <title>
wget -q https://github.com/mozilla/geckodriver/releases/latest
version=$(cat latest | sed -n -e 's!.*<title>Release \(.*\)</title>.*!\1!p')
# Keep only the first word of the title (the version number)
version=$(echo $version | sed 's/ .*//')
echo $version

echo "Installing geckodriver..."
wget https://github.com/mozilla/geckodriver/releases/download/v$version/geckodriver-v$version-macos.tar.gz
tar -xvzf geckodriver-v$version-macos.tar.gz
rm geckodriver-v$version-macos.tar.gz
chmod +x geckodriver
cp geckodriver /usr/local/bin/
To install the Firefox driver on Mac OS X, first install the selenium module (pip3 install
selenium, as described in Setup), then run:
chmod 755 firefox.sh
sudo ./firefox.sh
python3 test-firefox.py
Chrome
For Chrome it’s similar. You should install the driver and the selenium module. If you don’t have
Chrome installed, you should install the browser too.
First install the selenium module:
pip3 install selenium
Next the driver needs to be installed.
Copy the contents below and save it as chrome.sh
LATEST=$(wget -q -O - https://chromedriver.storage.googleapis.com/LATEST_RELEASE)
wget https://chromedriver.storage.googleapis.com/$LATEST/chromedriver_mac64.zip
unzip chromedriver_mac64.zip && sudo ln -s $PWD/chromedriver /usr/local/bin/chromedriver
To install the Chrome driver on Mac OS X use:
chmod 755 chrome.sh
sudo ./chrome.sh
This installs the web driver; together with the selenium module installed earlier, the setup is complete.
Installation output on Mac OS X

Install on Windows
To install Selenium on Windows 10:
1. Install Python
2. Install the latest web browser (Firefox, Chrome, Chromium…)
3. Download the web driver from https://www.seleniumhq.org/download/
4. Add the driver’s path to the PATH environment variable
In your program, you can also set the path to the driver explicitly, like this:
from selenium import webdriver

options = webdriver.ChromeOptions()
# point executable_path at the downloaded chromedriver.exe
driver = webdriver.Chrome(executable_path="path/to/chromedriver.exe", options=options)
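Once a driver is installed, a quick way to confirm that it can be found through PATH is Python's standard shutil.which function. The sketch below is only an illustration and not one of the book's scripts; the function name find_drivers is hypothetical.

```python
import shutil


def find_drivers(names=("geckodriver", "chromedriver")):
    """Map each driver name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for name, path in find_drivers().items():
    # None means the driver executable is not reachable through PATH.
    print(f"{name}: {path or 'not found on PATH'}")
```

If a driver shows up as not found, check the install location (/usr/local/bin in the scripts above) or the PATH setting on Windows.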
Chapter 2
Basics
You can try all examples (resources available) and run them on your own. If Selenium is installed,
they should work automatically. Then read this book or watch the videos to understand them better.
Run the programs from inside the directory (they use websites/*.html to demonstrate).
You may want to change the browser, depending on your web driver. By default the examples use Firefox.
Selenium examples
Selenium browser
You can programmatically control a web browser with selenium. The selenium module enables you
to control Firefox or Chrome or even mobile browsers.
Import the selenium module and the time module, then initialize Firefox.
We’ll import several modules:
from selenium import webdriver
import time
import os
The selenium module controls the web browser, the time module creates pauses, and the os
module is nice to have.
from selenium import webdriver
import time
import os

# start webdriver
driver = webdriver.Firefox()
Then we open a website using Firefox.
url = 'https://python.org'
driver.get(url)
So the driver is actually a link to the web browser.

If you run this code you’ll see a web browser (Firefox) pop up and show the website python.org,
all done programmatically.
This will work for any website or web app.
Navigate
Selenium is often used for testing. A simple test could be to check whether the webpage title is
equal to the expected title.
You can open a browser and navigate to a URL as shown in the example below, but this doesn’t validate
that the opened URL is actually working (you could have a lost internet connection, a timeout, page
not found, etc.).
from selenium import webdriver
import time
import os

# start webdriver
browser = webdriver.Firefox()

# navigate to URL
url = 'https://python.org'
browser.get(url)
You can access the page title using the .title attribute. Verification can be done
with an if-statement.
from selenium import webdriver

# Create an instance of Firefox WebDriver
browser = webdriver.Firefox()

# Navigate to URL
browser.get('https://python.org')

# Verify webpage title
if browser.title == "Welcome to Python.org":
    print("Page title is correct")
else:
    print("Page title is incorrect")

# Quit the browser window
browser.quit()
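If you want to run the same title test against several pages, the check can be wrapped in a small function. This is a minimal sketch: check_title is a hypothetical helper name, and it relies only on the get() method and the .title attribute shown above.

```python
def check_title(driver, url, expected_title):
    """Open url and report whether the page title equals expected_title."""
    driver.get(url)
    actual = driver.title
    if actual != expected_title:
        # Show what the browser actually reported, to make failures easier to read.
        print(f"Title mismatch for {url}: got {actual!r}")
    return actual == expected_title
```

With a started browser you could call it as check_title(browser, 'https://python.org', 'Welcome to Python.org').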
Mimic browsing behavior

If you specify another website, the browser opens that one:
driver.get("https://stackoverflow.com")
This way you let Python mimic web browsing behavior:
from selenium import webdriver
import time
import os

# start webdriver
driver = webdriver.Firefox()

# Open webpage
driver.get("https://python.org")
time.sleep(5)

# Open 2nd webpage
driver.get("https://stackoverflow.com")
time.sleep(5)

# Open 3rd webpage
driver.get("https://reddit.com")
time.sleep(5)
To close the browser window programmatically, use
driver.quit()
Back, refresh
There are other navigation features you can use, such as going to the previous page with .back() or
to the next page with .forward(). The browser controlled by selenium actually keeps a history, just
like a regular browser does. You can refresh the page with .refresh().
The example below starts the Firefox browser, maximizes the window, opens a URL, then opens
another URL, goes back and refreshes. In short: all the navigational features of the browser you
normally use can be accessed programmatically.
from selenium import webdriver

# Open browser
driver = webdriver.Firefox()

# maximize browser window
driver.maximize_window()

# open URL
driver.get("https://pythonbasics.org")

# show page title in console
print(driver.title)

# show current URL
print(driver.current_url)

# open another URL (python.org is used here as an example)
driver.get("https://python.org")

# click back
driver.back()

# click refresh
driver.refresh()

# close the browser
driver.close()
Take screenshot automatically

If you want to take a screenshot of a website, you can call the driver’s method
get_screenshot_as_file(filename)
So you can do the following:
from selenium import webdriver
import time

# Start web browser
driver = webdriver.Firefox()

# Open URL
driver.get('https://python.org')
time.sleep(1)

# Take screenshot and save in same directory as program
driver.get_screenshot_as_file('image.png')

# Wait and quit browser
time.sleep(1)
driver.quit()
When you run this code, it opens the website and takes a screenshot. The screenshot is saved as
a .png file in the same directory as your Python script. You can define a path if you want to, like the
Pictures directory.
Note that this only takes a screenshot of the visible part of the webpage. A full
webpage screenshot is covered in the section Full page screenshot later in this chapter.
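To save screenshots into a specific folder such as the Pictures directory, you can build the path with the standard pathlib module. A minimal sketch, assuming a started driver; the helper name save_screenshot is hypothetical.

```python
from pathlib import Path


def save_screenshot(driver, directory, name="image.png"):
    """Save a screenshot of the visible page area as directory/name."""
    target_dir = Path(directory).expanduser()      # allows paths like "~/Pictures"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    target = target_dir / name
    # get_screenshot_as_file returns True on success and False on an I/O error
    saved = driver.get_screenshot_as_file(str(target))
    return target if saved else None
```

For example, save_screenshot(driver, "~/Pictures") would write image.png into the Pictures folder of your home directory.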
Get HTML Source

Every webpage on the internet is written in HTML, “Hyper Text Markup Language”. In HTML every
webpage element is defined by a tag, written between the less-than and greater-than symbols. Like <b>
for bold, <u> for underline, <p> for paragraph etc.
Selenium uses HTML to interact with the browser.
Anything you see on the web is an HTML element and can be identified, either by unique id or
xpath.
If you have never worked with HTML before, it is useful to learn it for web automation. But because
there is too much to be written about HTML, it is outside the scope of this book.
Is it possible to see the HTML source of a webpage? Yes it is. The program below loads a webpage
and shows the HTML code. Selenium calls the HTML source page_source.
from selenium import webdriver
import time

# Open a web browser window
driver = webdriver.Firefox()

# Navigate to a webpage (python.org is used here as an example)
driver.get('https://python.org')

# Get the webpage html source code and save it in the variable html
html = driver.page_source

# Output the html code
print(html)

# Wait and close web browser
time.sleep(2)
driver.quit()
The attribute page_source contains the current webpage’s HTML content.

Note: You can also see the HTML page source from the browser: right click, view page source.
Driver variables
The selenium driver object stores more variables: the current URL and the webpage title. These
variables change each time you open a new URL using the get() method.
The program below opens a website using driver.get(), then outputs the page URL and the page title.
from selenium import webdriver
import time

# Start the web browser
driver = webdriver.Firefox()

# Open url (python.org is used here as an example)
driver.get('https://python.org')

# Get current page url
print(driver.current_url)

# Get current page title
print(driver.title)

driver.quit()
Both variables, .title and .current_url, change with each new webpage you visit.
Take bulk screenshots

You can easily visit a list of websites automatically. Initialize the driver and then specify
your list of websites, which can be any sites. Then, for every website in the list, get the
web page and pause briefly.
You might also want to print the title.
from selenium import webdriver
import time

driver = webdriver.Firefox()
web = [ 'https://python.org',
        'https://stackoverflow.com',
        'https://pypy.org' ]

for w in web:
    # get url
    driver.get(w)

    # output page title
    print(driver.title)
    time.sleep(1)

driver.quit()
When you run it, you’ll see it visit each of those websites and print its title.
You can also take a screenshot of each webpage in the list:
from selenium import webdriver
import time

driver = webdriver.Firefox()

web = [ 'https://python.org',
        'https://stackoverflow.com',
        'https://pypy.org' ]

i = 1
for w in web:
    # get url
    driver.get(w)

    # output page title
    print(driver.title)
    time.sleep(1)

    driver.get_screenshot_as_file(f'image-{i}.png')
    i = i + 1

driver.quit()
Each image is then stored under a unique number.
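Numbered file names do not tell you which site an image belongs to. One option is to derive the file name from the URL with the standard urllib.parse module. This is a sketch of that idea; screenshot_name is a hypothetical helper, not part of selenium.

```python
from urllib.parse import urlparse


def screenshot_name(url, index):
    """Build a file name such as '1-python.org.png' from a URL and a counter."""
    host = urlparse(url).hostname or "page"  # fall back if the URL has no host
    return f"{index}-{host}.png"


web = ['https://python.org', 'https://stackoverflow.com', 'https://pypy.org']
for i, w in enumerate(web, start=1):
    print(screenshot_name(w, i))
```

Inside the loop above you would pass the result to driver.get_screenshot_as_file() instead of the f'image-{i}.png' string.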
Full page screenshot

If you want to take a full webpage screenshot, you can use the code below. It uses a JavaScript trick
to capture the full page: the window is resized to the page’s full scroll width and height before
the <body> element is captured.
#coding=utf-8

import time
from selenium import webdriver

driver = webdriver.Firefox()
driver.get('https://python.org')  # example URL

S = lambda X: driver.execute_script('return document.body.parentNode.scroll'+X)
driver.set_window_size(S('Width'),S('Height')) # May need manual adjustment

driver.find_element_by_tag_name('body').screenshot('screenshot.png')
driver.quit()
Click button
You can make selenium click a button automatically. This can be any button on a web page.
Once you have the button element, you click it like this:
python_button.click()
In order to find the button, you can use the button text.
python_button = driver.find_element_by_xpath("//button[contains(text(),'Click Me!')]")
You’ll notice we use the method
find_element_by_xpath()
So what is an XPath?
XPath is a query language for selecting nodes from an XML document, in this case the webpage
(HTML). You don’t have to worry too much about this, as you can get the XPath of any web element
using Chrome Developer Tools or Firefox Developer Tools. Both let you find the XPath (the videos
show how).
A complete example is shown below. It opens a webpage and clicks on the button. The button
reacts with a JavaScript alert, which selenium then clicks automatically too.
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/button.html'
driver.get(url)
time.sleep(1)

# click button
python_button = driver.find_element_by_xpath("//button[contains(text(),'Click Me!')]")
python_button.click()
If the web browser creates a popup, you can also click on that. The driver lets you switch to the alert
and click its accept button, which closes the popup window.
# close javascript popup
time.sleep(1)
alert = driver.switch_to.alert
alert.accept()

time.sleep(5)
driver.quit()
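The examples in this book build the local page URL by gluing 'file:///' and os.getcwd() together. As an alternative, the standard pathlib module can produce the file URL for you, which also takes care of path separators and spaces in folder names. A small sketch; local_page_url is a hypothetical helper name.

```python
from pathlib import Path


def local_page_url(relative_path):
    """Return a file:// URL for a path relative to the current directory."""
    return Path(relative_path).resolve().as_uri()


print(local_page_url("websites/button.html"))
```

The result can be passed straight to driver.get().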
HTML Text input
Textfield input
Many websites provide some type of interaction through HTML input fields. Search engines let you
type a search query, forms let you fill in your information etc.
An input field can be completed automatically with Python selenium. The program below opens the
search engine duck.com and automatically types a search query.
The example uses the method find_element_by_id() to find the input box by its unique id. You can
see this id by viewing the webpage source in your browser. Sometimes an element doesn’t have a
unique id, in which case you can use its xpath.
The method call send_keys() enters text into the selected text field. This can be any type of text you
want.
from selenium import webdriver

# Create an instance of Firefox WebDriver
browser = webdriver.Firefox()

# Navigate to URL
browser.get('https://duck.com')

# Verify webpage title
if browser.title == "DuckDuckGo — Privacy, simplified.":
    print("Page title is correct")
else:
    print("Page title is incorrect")
    exit()


# Find field by id
textfield = browser.find_element_by_id('search_form_input_homepage')
textfield.send_keys('ducks')

# Quit the browser window
#browser.quit()
Finally, to do the actual search and not only type in the box you can send the enter key. To send the
enter key, first add another import:
from selenium.webdriver.common.keys import Keys
Then send the return key (Keys.RETURN):
textfield.send_keys(Keys.RETURN)
The program below does a search on the website duck.com automatically, but this principle works
for any search engine.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Create an instance of Firefox WebDriver
browser = webdriver.Firefox()

# Navigate to URL
browser.get('https://duck.com')

# Verify webpage title
if browser.title == "DuckDuckGo — Privacy, simplified.":
    print("Page title is correct")
else:
    print("Page title is incorrect")
    exit()


# Find field by id
textfield = browser.find_element_by_id('search_form_input_homepage')
textfield.send_keys('ducks')

# Search
textfield.send_keys(Keys.RETURN)
If you have a web page with text fields like a first name field, last name or different input fields you
can make the computer type those text fields automatically.
Textfields
You can use the HTML element id or XPath to find the element.
field1 = driver.find_element_by_id("FirstName")
Then send keyboard keys into that textbox
field1.send_keys('Donald')
You’ll want to wait a second, as you don’t want it to select the next element while typing.
time.sleep(1)
Example code is shown below (you can run the attached code):
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/textfields.html'
driver.get(url)
time.sleep(1)

# enter first name
field1 = driver.find_element_by_id("FirstName")
field1.send_keys('Donald')
time.sleep(1)

# enter last name
field2 = driver.find_element_by_id("LastName")
field2.send_keys('Mouse')
time.sleep(1)

# close browser
time.sleep(5)
driver.quit()
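When a form has many fields, repeating the find and send_keys lines for each one gets tedious. One way to shorten this is a loop over a dictionary of element ids and values. The sketch below is an assumption-laden illustration: fill_fields is a hypothetical helper, and it uses the generic find_element("id", ...) locator form, which as far as I know also works in newer selenium releases where the find_element_by_* shortcuts are no longer available.

```python
import time


def fill_fields(driver, values, pause=0.0):
    """Type each value into the element whose id matches the dictionary key."""
    for element_id, text in values.items():
        field = driver.find_element("id", element_id)  # "id" is the By.ID locator
        field.send_keys(text)
        time.sleep(pause)  # optional pause between fields
```

For the page above you could call fill_fields(driver, {"FirstName": "Donald", "LastName": "Mouse"}, pause=1).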
Textbox input
If you want to fill a text area or text box (instead of a text field), that’s possible too.
In principle it works the same, but you can add newline characters (\n) to enter multiple lines.
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/textbox.html'
driver.get(url)
time.sleep(1)

# enter text
field1 = driver.find_element_by_id("textbox")
field1.send_keys('Selenium example text\nAnother line\nAnother line..')
time.sleep(1)

# close browser
time.sleep(5)
driver.quit()
Textbox
Get HTML element text

If you have an HTML element, you can also grab its text. This could be the text in a paragraph, an h1
title, an h2 title, etc.
If your HTML element has a unique id, you can do it in one line. You’ll want to change to the correct
id.
print( driver.find_element_by_id("title1").text )
If not, you can use the XPath that you can grab using Firefox or Chrome developer tools.
Example code:
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/element.html'
driver.get(url)
time.sleep(1)

# get element text
print( driver.find_element_by_id("text1").text )
print( driver.find_element_by_id("text2").text )
print( driver.find_element_by_id("profile").text )

time.sleep(5)
driver.quit()
Scroll page
There are multiple ways to scroll down a webpage: by JavaScript injection and by sending keystrokes.
You can execute JavaScript by calling the method driver.execute_script(). To scroll to a vertical
position, pass window.scrollTo(0, Y), where Y is the vertical offset in pixels.
driver.execute_script("window.scrollTo(0, Y)")
To scroll to the bottom of the page, you can do this:
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
The program below opens a website, then scrolls to the bottom of the page using JavaScript injection.
from selenium import webdriver
import time

driver = webdriver.Firefox()
driver.get('https://python.org')  # example URL
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(2)
driver.quit()
You can also scroll by keypress; that works with the method send_keys().
from selenium.webdriver.common.keys import Keys
html = driver.find_element_by_tag_name('html')
html.send_keys(Keys.END)
This code opens a webpage and scrolls the page using send_keys().
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Firefox()
driver.get('https://python.org')  # example URL
html = driver.find_element_by_tag_name('html')
html.send_keys(Keys.END)
time.sleep(2)
driver.quit()
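Some pages load more content each time you reach the bottom, so one scroll is not enough. A possible approach, sketched below, is to keep scrolling until the page height stops growing. The helper name scroll_until_stable and its pause and max_rounds parameters are assumptions for this illustration; only execute_script() comes from the examples above.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=10):
    """Scroll to the bottom repeatedly until the page height stops changing.

    Returns the number of scroll rounds that were performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load more content
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds  # nothing new was loaded, we are at the real bottom
        last_height = new_height
    return max_rounds  # gave up after max_rounds attempts
```

The max_rounds limit keeps the loop from running forever on pages that never stop loading.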
Test if webpage contains text

Selenium can be used for automated testing but also for web scraping. You may want to test whether a
text exists on a webpage or do other types of tests.
The program below visits the oldest webpage on the internet and tests whether the text of its <h1>
heading has changed.
from selenium import webdriver
import time

# open Firefox web browser
browser = webdriver.Firefox()

# open webpage
browser.get("http://info.cern.ch/")

# test if page contains the text in <h1>
element = browser.find_element_by_tag_name('h1')
print(element.text)
assert element.text == 'http://info.cern.ch - home of the first website'


# exit browser
browser.quit()
If the text does not match, the assert statement raises an AssertionError. For example, asserting
against the text 'monkeys' gives:
Traceback (most recent call last):
  File "tmp.py", line 14, in <module>
    assert element.text == 'monkeys'
AssertionError
This works for any type of element and text, so you can test if a web app or webpage gives the
correct output.
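If you only care whether some text appears anywhere on the page, rather than in one specific element, you can look at the text of the <body> element. A minimal sketch; page_contains is a hypothetical helper, and it uses the generic find_element("tag name", ...) locator form.

```python
def page_contains(driver, text):
    """Return True if text occurs in the visible text of the page body."""
    body = driver.find_element("tag name", "body")  # "tag name" is By.TAG_NAME
    return text in body.text
```

In a test you could then write assert page_contains(browser, 'home of the first website').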
Chapter 3
Locate elements by XPath
On a web page each HTML element should have a unique id or a unique name, by which you can
get the HTML element.
element = driver.find_element_by_name("...")
element = driver.find_element_by_id("...")
In the web page code (HTML code) you can find those unique ids and names. For example, something
like this:
Webpage (HTML) elements with a unique id
But in practice that’s often not the case: elements do not always have a unique id or a unique name.
In that case you can select an HTML element using its XPath. XPath will give you the path to the
element.
The general way to grab an HTML element by XPath using selenium:
element = driver.find_element_by_xpath("...")
To get the XPath you can use your browser. The way to get it depends on which web browser you
use. In general it’s like this: open your web browser on the page, start developer tools and grab the
XPath.
XPath in Firefox
In the Mozilla Firefox browser:
• press the F12 key or Ctrl+Shift+I
• click on Elements or Inspector
• select the page inspector (Ctrl+Shift+C), the little arrow, and go to the element
• right click on the HTML element > Copy > Copy Full XPath
XPath in Mozilla Firefox
This will return a path like this: /html/body/div/div[2]/div[2]/div[5]/div[3]/span/a

The copied XPath is a query that selects exactly that element. You can then grab
your HTML element using the same XPath.
element = driver.find_element_by_xpath("/html/body/div/div[2]/div[2]/div[5]/div[3]/span/a")
Xpath in Chrome
In Chrome:
• Open Chrome Developer Tools
• Right click the item whose XPath you are trying to find and choose “Inspect”
• Right click on the highlighted area in the Elements panel
• Go to Copy > Copy XPath
XPath in Google Chrome
This will return a path like this:
/html/body/div/div[2]/div[2]/div[5]/div[3]/span/a
The copied XPath is a query that selects exactly that element.
You can then grab your HTML element using the same XPath.
element = driver.find_element_by_xpath("/html/body/div/div[2]/div[2]/div[5]/div[3]/span/a")
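To get a feel for how such paths select elements without opening a browser, you can try them on a small document with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. This is only an illustration: the PAGE markup below is made up, ElementTree needs well-formed XML, and paths are written relative to the root <html> element instead of starting with /html.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up page used only for this demonstration.
PAGE = """
<html>
  <body>
    <div><p>intro</p></div>
    <div>
      <span><a href="https://python.org">Python</a></span>
      <span><a>no href here</a></span>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())  # root is the <html> element

# Positional path: the second <div> inside <body>, then span/a.
# This corresponds to the absolute XPath /html/body/div[2]/span/a.
by_position = root.findall("./body/div[2]/span/a")

# Attribute predicate: every <a> that has an href attribute, at any depth.
with_href = root.findall(".//a[@href]")

print([a.text for a in by_position])
print([a.get("href") for a in with_href])
```

The index in div[2] counts from 1, which is why it picks the second <div> and not the third.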
Get all links from webpage

You can get all links from a webpage using the XPath "//a[@href]". The method call
driver.find_elements_by_xpath("//a[@href]") returns a list of all links. Depending on the
webpage, sometimes the domain or protocol (https, http, ftp) is not included.
elems = driver.find_elements_by_xpath("//a[@href]")
for elem in elems:
    print(elem.get_attribute("href"))
The program below gets the links from a webpage using the method find_elements_by_xpath(). It
then iterates over the list of links and outputs the href value (link value) for each one. In HTML,
links do not only have a link value href, but also a description .text.
from selenium import webdriver
import time

# Start browser
driver = webdriver.Firefox()

# Open url (python.org is used here as an example)
driver.get('https://python.org')

# get links from webpage
elems = driver.find_elements_by_xpath("//a[@href]")

# go over list of links and print each one
for elem in elems:
    print(elem.get_attribute("href"))

time.sleep(2)
To get the text of the links, you can change the loop into:
# show link texts for each link
for elem in elems:
    print(elem.text)
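If some of the collected link values turn out to be relative (no domain or protocol), you can resolve them against the page URL with the standard urllib.parse.urljoin function. A minimal sketch; absolute_links is a hypothetical helper and the example.com addresses are placeholders.

```python
from urllib.parse import urljoin


def absolute_links(base_url, hrefs):
    """Resolve each href against base_url, skipping empty values."""
    return [urljoin(base_url, href) for href in hrefs if href]


links = absolute_links(
    "https://example.com/docs/",
    ["page.html", "/about", "https://python.org"],
)
for link in links:
    print(link)
```

With selenium you would pass driver.current_url as the base and the collected href values as the list.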
Chapter 4
Get Links from Webpage
You can grab all the links from a webpage. In a webpage (HTML), a link is defined using the syntax:
<a href="https://www.website.com">website</a>
That’s why you can grab links using the XPath:
elems = driver.find_elements_by_xpath("//a[@href]")
Here we use find_elements_by_xpath(), not find_element_by_xpath().

Then a for loop is used to iterate over them:
for elem in elems:
    print(elem.get_attribute("href"))
    print(elem.get_attribute("text"))
That gives us this code:
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/links.html'
driver.get(url)
time.sleep(1)

elems = driver.find_elements_by_xpath("//a[@href]")
for elem in elems:
    print(elem.get_attribute("href"))
    print(elem.get_attribute("text"))

driver.quit()
This will output all the links on the webpage, and all the text that is used to describe the links.
Radiobuttons
If your webpage has radio buttons, you can click on them automatically. To do so, get the radio
button’s id, name or xpath, then call the .click() method.
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/radio.html'
driver.get(url)
time.sleep(1)

# select element
radio = driver.find_element_by_xpath('/html/body/form/input[3]')
radio.click()
time.sleep(2)
A webpage with radio buttons:
Radio buttons
Dropdown
You can select an item from a dropdown menu or listbox. This works slightly differently, as you have
to import Select from selenium. Then there are the methods select_by_visible_text() and
select_by_value().
from selenium.webdriver.support.ui import Select
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/dropdown.html'
driver.get(url)
time.sleep(1)

# select element
select = Select(driver.find_element_by_id('country'))
time.sleep(2)

# select by visible text
select.select_by_visible_text('Germany')

# select by value
#select.select_by_value('5')
Drop down menu
Checkbox
Selenium can click on one or more checkboxes for you. Just get the element using xpath, id or name
and call the .click() method.
from selenium import webdriver
import time
import os

# start webdriver and load website from disk
driver = webdriver.Firefox()
url = 'file:///' + os.getcwd() + '/websites/checkbox.html'
driver.get(url)
time.sleep(1)

# select element
cbox = driver.find_element_by_xpath('/html/body/form/input[3]')
cbox.click()
time.sleep(2)

# select element
cbox = driver.find_element_by_xpath('/html/body/form/input[1]')
cbox.click()
time.sleep(2)
Figure: checkboxes
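Keep in mind that .click() toggles a checkbox: clicking a box that is already ticked unticks it. If you care about the end state rather than the click, a small sketch like the one below may help. The helper name set_checkbox is hypothetical; is_selected() is the WebElement method that reports the current state.

```python
def set_checkbox(checkbox, checked):
    """Make sure `checkbox` ends up checked (True) or unchecked (False).

    Hypothetical helper. Clicks only when the current state differs
    from the wanted one, and returns the state after the call.
    """
    if checkbox.is_selected() != checked:
        checkbox.click()
    return checkbox.is_selected()
```

Calling set_checkbox(cbox, True) twice in a row leaves the box ticked, while two plain cbox.click() calls would tick and then untick it.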
Table
Webpages often contain tables, and you can grab all the table data with Python selenium: rows,
columns and the text in each cell. The first thing to do is to import the modules:

    from selenium import webdriver
    import time
    import os

Then open a webpage. This can be a webpage on the internet or on your computer; as long as the
webpage has a table on it, it will work.

    driver.get(url)

The table element can be fetched using find_element_by_id() or find_element_by_xpath().

    from selenium import webdriver
    import time
    import os

    # start web browser
    driver = webdriver.Firefox()
    url = 'file:///' + os.getcwd() + '/websites/table.html'
    driver.get(url)
    time.sleep(1)

    # select table
    table = driver.find_element_by_id('table')

    # iterate through table
    rows = table.find_elements_by_tag_name("tr")
    for row in rows:
        # header rows use <th>, so only rows with enough <td> cells are printed
        cols = row.find_elements_by_tag_name("td")
        if len(cols) > 1:
            # print the second column (index 1)
            print(cols[1].text)

    time.sleep(1)
    driver.quit()
Figure: table on a webpage
Count rows in table

You can count the number of rows in a table automatically. To do so, you count the number of <tr>
tags, which mark table rows. First open a webpage, then find the table using id or xpath and then
count the number of tr elements inside it.
In HTML, a table is always constructed in this manner:

    <table>
      <tr><td>Row 1</td></tr>
      <tr><td>Row 2</td></tr>
      <tr><td>Row 3</td></tr>
    </table>

That’s why counting the number of <tr> tags is counting the number of rows. Inside a row you can
find one or more cells (<td>), but those are irrelevant here because we count rows, not cells.
The example below opens a website and counts all the rows in the table shown. This works for any
driver including Firefox, Chrome etc. If you change the website, also change the table’s xpath (you
can find it with the developer tools inspector).
    import time
    from selenium import webdriver

    # Start the browser
    driver = webdriver.Firefox()

    # Maximize the browser window
    driver.maximize_window()

    # Navigate to page
    driver.get("https://news.ycombinator.com")

    # Find the table element in the page
    table = driver.find_element_by_xpath("/html/body/center/table/tbody/tr[3]/td/table")

    # Find the tr elements inside this table. The leading dot keeps the
    # search relative to the table; "//..." would search the whole page.
    rows = table.find_elements_by_xpath(".//tr")
    print(f"Total rows: {len(rows)}")

    # Pause the script for 3 seconds
    time.sleep(3)

    # Close the browser
    driver.close()
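To see the counting idea without starting a browser, here is a minimal sketch that uses only Python's standard library html.parser module instead of Selenium. The names RowCounter and count_rows and the sample table are made up for this illustration. Note that it counts every <tr> in the HTML it is given, including rows of nested tables.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Count the <tr> start tags seen while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase
        if tag == "tr":
            self.rows += 1


def count_rows(html):
    """Return the number of <tr> tags in the given HTML string."""
    counter = RowCounter()
    counter.feed(html)
    return counter.rows


sample = """
<table>
  <tr><td>Row 1</td></tr>
  <tr><td>Row 2</td></tr>
  <tr><td>Row 3</td></tr>
</table>
"""
print(f"Total rows: {count_rows(sample)}")  # Total rows: 3
```

You could also feed it driver.page_source, in which case it counts the rows of all tables on the page rather than one specific table.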
Table to list
Tables on any webpage can be converted into Python lists. Start by importing the selenium module
and starting a webdriver object. This can be Firefox, Chrome or any of the other supported browsers.
Optionally you can maximize the window, but selenium continues to work even if the window is
minimized or hidden.

    import time
    from selenium import webdriver

    # Start the browser
    driver = webdriver.Firefox()
Open a webpage with a table on it. There can be more than one table on the page, as long as you
specify the correct xpath later.
    # Open webpage
    driver.get("https://www.w3schools.com/html/html_tables.asp")

The table element is located by xpath, then the rows are grabbed.

    # Get table
    table = driver.find_element_by_xpath("/html/body/div[6]/div[1]/div[1]/div[3]")

    # Get rows
    rows = table.find_elements_by_tag_name("tr")

You can always use the xpath; sometimes you can use the id instead.
A loop goes over every row and, inside it, over every column. The text in each cell is added to a
list. After the loop has completed, the table data is in the Python list.
The list in question is named result_data. It is filled by a nested loop (one loop inside another),
because every row and every column has to be visited: each row becomes a list of cell values, and
result_data is a list of those row lists. The value len(rows) is the height of the table and
len(cols) is the width of the current row, both counted in number of cells.

    for i in range(0,len(rows)):
        ...
        for j in range(0,len(cols)):
            ...

For every row in the loop, you can get the columns by collecting its <td> tags.

    cols = rows[i].find_elements_by_tag_name('td')

Then you can get the cell data using cols[j].text.encode('utf-8'). The example below shows the
complete loop that converts the html table to a python list.
    # Get text in each column
    for i in range(0,len(rows)):
        cols = rows[i].find_elements_by_tag_name('td')
        cols_data = []
        for j in range(0,len(cols)):
            # Get the text
            cols_data.append(cols[j].text.encode('utf-8'))
        result_data.append(cols_data)
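One detail worth knowing: in Python 3 the .text of an element is already a str, so .encode('utf-8') turns every cell into a bytes object, which prints with a b'...' prefix. The short sketch below shows the difference with a made-up cell value. If you prefer plain strings in result_data, you can leave out the .encode() call.

```python
text = "Caf\u00e9"              # stands in for a cell's .text value (a str)
encoded = text.encode("utf-8")  # what the loop above stores (bytes)

print(type(text).__name__, len(text))   # str 4
print(type(encoded).__name__, encoded)  # bytes b'Caf\xc3\xa9'
print(encoded.decode("utf-8") == text)  # True
```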
Then print the list, otherwise it’s just in the computer’s memory.

    # Print the result list
    print(result_data)

You may want to close the browser as well.

    # Pause the script for 3 sec
    time.sleep(3)

    # Close the browser
    driver.close()

If you change the website, also change the xpath, because the xpath is different for every
website. You can get the xpath using Firefox dev tools or Chrome dev tools.
    import time
    from selenium import webdriver

    # Start the browser
    driver = webdriver.Firefox()

    # Maximize the browser window
    driver.maximize_window()

    # Open webpage
    driver.get("https://www.w3schools.com/html/html_tables.asp")

    # Get table
    table = driver.find_element_by_xpath("/html/body/div[6]/div[1]/div[1]/div[3]")

    # Get rows
    rows = table.find_elements_by_tag_name("tr")

    # Create a list to store the text
    result_data = []

    # Get text in each column
    for i in range(0,len(rows)):
        cols = rows[i].find_elements_by_tag_name('td')
        cols_data = []
        for j in range(0,len(cols)):
            # Get the text
            cols_data.append(cols[j].text.encode('utf-8'))
        result_data.append(cols_data)

    # Print the result list
    print(result_data)

    # Pause the script for 3 sec
    time.sleep(3)

    # Close the browser
    driver.close()
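The index-based loops above can also be written by iterating over the elements directly. The sketch below is one possible way to do that, not the only one. The helper name table_to_list is hypothetical, it uses the generic find_elements() method with the locator string "tag name" (the value behind By.TAG_NAME), it keeps the cell text as str, and it skips rows without <td> cells, such as a header row built from <th>.

```python
def table_to_list(table):
    """Return the cell text of `table` as a list of rows.

    Hypothetical helper. `table` is the element found with the xpath above.
    Each row is a list of strings, so the result is a list of lists.
    """
    result = []
    for row in table.find_elements("tag name", "tr"):
        cells = row.find_elements("tag name", "td")
        if cells:  # rows made only of <th> cells have no <td>
            result.append([cell.text for cell in cells])
    return result
```

With the table from the listing above, result_data = table_to_list(table) gives one inner list per data row.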
css selector
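You can also select elements with a css selector. CSS (cascading style sheets) uses selectors to
decide which html elements a style applies to, for example #button for the element with id
"button" or .menu for elements with class "menu". Selenium accepts the same selectors to find any
webpage element.

    driver.find_element_by_css_selector()

The program below uses the css selector #button to select the html element with the id button.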

    from selenium import webdriver
    import time
    import os

    # start web browser
    driver = webdriver.Firefox()
    url = 'file:///' + os.getcwd() + '/websites/button.html'
    driver.get(url)
    time.sleep(1)

    # click button
    python_button = driver.find_element_by_css_selector("#button")
    python_button.click()

    # close javascript popup
    time.sleep(1)
    alert = driver.switch_to.alert
    alert.accept()

    time.sleep(5)
    driver.quit()
Wait for page load

If you want to wait for the page to complete loading, you can use WebDriverWait instead of
time.sleep(). It waits until a given element is present, or until the timeout expires:

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By

    # Open web browser
    driver = webdriver.Firefox()

    # Open URL
    driver.get('https://python.org')

    # Maximum wait time in seconds
    timeout = 3
    try:
        element_present = EC.presence_of_element_located((By.ID, 'main'))
        WebDriverWait(driver, timeout).until(element_present)
    except TimeoutException:
        print("Timed out waiting for page to load")
    else:
        # runs only when the element appeared before the timeout
        print("Page loaded")
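Under the hood, WebDriverWait keeps calling the condition (every half second by default) until it returns something truthy or the timeout is reached. The sketch below rebuilds that idea in plain Python to make it visible. It is not Selenium's own code, and the name wait_until is hypothetical.

```python
import time


def wait_until(condition, timeout, poll=0.5):
    """Call condition() until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper that mimics the polling idea behind WebDriverWait.
    Returns the truthy value, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Demo: a condition that becomes true on the third call
calls = []

def ready():
    calls.append(1)
    return len(calls) >= 3

wait_until(ready, timeout=2, poll=0.01)
print(len(calls))  # 3
```

Compared with a fixed time.sleep(), polling returns as soon as the condition holds and only waits the full timeout when something is wrong.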
Check if element exists
Find by css selector
You can let selenium check if an HTML element exists. One way to do that is using css selectors.
The method find_elements_by_css_selector() returns a list of all elements that match the selector.
The list is empty when nothing matches, and an empty list counts as False in an if statement.

    if driver.find_elements_by_css_selector('#element'):
        print("Element exists")

The program below opens a webpage and looks for the submit button.

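    from selenium import webdriver
    import time

    # Start web browser
    driver = webdriver.Firefox()

    # Open url (not shown in the original listing; python.org is a stand-in)
    driver.get('https://python.org')

    if driver.find_elements_by_css_selector('#submit'):
        print("Element exists")
    else:
        print("Element does not exist")

    driver.quit()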
If unsure, you can grab the css selector using the developer tools in Firefox and Chrome.
Find by id
If an HTML element has a unique id, you can verify it exists by calling find_elements_by_id() as
shown in the program below.

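    from selenium import webdriver
    import time

    # Start browser
    driver = webdriver.Firefox()

    # Open url (not shown in the original listing; python.org is a stand-in)
    driver.get('https://python.org')

    # Use HTML id 'submit' to find element
    e = driver.find_elements_by_id('submit')
    if e:
        print("Element exists")

    driver.quit()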
Find by xpath
Finally, you can always use the XPath to find an element. That works as long as the webpage doesn’t
change.
The function below returns True if the element is found and False if it isn’t. It takes the driver
object as its first argument.

    from selenium.common.exceptions import NoSuchElementException

    def check_exists_by_xpath(driver, xpath):
        try:
            driver.find_element_by_xpath(xpath)
        except NoSuchElementException:
            return False
        return True
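The css selector and id checks share one idea: ask for a list of matches and see whether it is empty. As a hedged sketch, that can be folded into a single helper built on the generic find_elements(by, value) method. The name element_exists is hypothetical, and the locator strings "id", "css selector" and "xpath" are the values behind Selenium's By.ID, By.CSS_SELECTOR and By.XPATH.

```python
def element_exists(driver, by, value):
    """Return True if at least one element matches the locator.

    Hypothetical helper. find_elements() returns an empty list when
    nothing matches, so no exception handling is needed.
    """
    return len(driver.find_elements(by, value)) > 0
```

Possible calls are element_exists(driver, "id", "submit") or element_exists(driver, "css selector", "#submit").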
Chapter 5
Maximize window
To maximize the browser window, call the maximize_window() function:

    from selenium import webdriver
    import time

    driver = webdriver.Firefox()
    driver.maximize_window()
    time.sleep(5)
Minimize window
Even if a web browser is not maximized or visible, it can still continue to do tasks. That means you
don’t need to have the web browser open in full screen all the time; it can work in the background.
To minimize a web browser, you can use the script below:

    from selenium import webdriver
    import time

    driver = webdriver.Firefox()
    driver.minimize_window()

A minimized window is still visible in the task bar. It’s the same as pressing the minimize button.
Hide window
You can hide the window completely. Unlike a minimized window that is visible in the taskbar, a
headless window is not shown anywhere. But you can still interact with it.
In headless mode everything still works, including clicking buttons, links, scrolling etc.
Firefox can be started in headless mode, by adding the options flag options.headless = True. The
program below opens a website while in headless mode and outputs the page title.
    from selenium.webdriver.firefox.options import Options
    from selenium import webdriver
    import time

    # Create profile with headless option
    fp = webdriver.FirefoxProfile()
    options = Options()
    options.headless = True
    fp.set_preference("browser.acceptInsecureCer", True)

    # Start browser using profile
    driver = webdriver.Firefox(firefox_profile=fp, options=options)

    # Open URL (not shown in the original listing; python.org is a stand-in)
    driver.get('https://python.org')

    # Show title
    print(driver.title)
    time.sleep(2)
Window size and position

Selenium lets you set the window size and position. You can set the size with set_window_size(),
where the parameters are width and height. The position on the screen can be set with
set_window_position(), where the parameters are the horizontal and vertical screen position.
The program below starts firefox, places it in the top left corner and sets the window size.
    from selenium import webdriver
    import time

    # Open web browser
    driver = webdriver.Firefox()

    # Set position on screen. Top left is (0, 0)
    driver.set_window_position(0, 0)

    # Set window size: width, height
    driver.set_window_size(1024, 768)

    # Open URL (not shown in the original listing; python.org is a stand-in)
    driver.get('https://python.org')

    # Show title
    print(driver.title)

    # Quit browser
    time.sleep(2)
    driver.quit()
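Because positions are measured from the top left corner of the screen, placing a window somewhere else is simple arithmetic. The sketch below computes the position that centers a window. The function name centered_position and the 1920 by 1080 screen size are assumptions made for this example.

```python
def centered_position(screen_width, screen_height, window_width, window_height):
    """Return (x, y) of the top left corner that centers the window on the screen.

    Hypothetical helper. Never returns negative coordinates, so a window
    larger than the screen is placed at the edge instead.
    """
    x = max((screen_width - window_width) // 2, 0)
    y = max((screen_height - window_height) // 2, 0)
    return x, y


# Assumed screen size: 1920x1080, window size as in the listing above
x, y = centered_position(1920, 1080, 1024, 768)
print(x, y)  # 448 156
```

The result can then be passed on with driver.set_window_size(1024, 768) followed by driver.set_window_position(x, y).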
Tabs
Most modern web browsers support tabs. You usually have one or more tabs open, and you can let
Python do the same things: open tabs and switch tabs.
To open a new tab, you can execute Javascript on the page. To switch tabs, you can call
switch_to.window().

    from selenium import webdriver
    import time

    # start web browser
    driver = webdriver.Firefox()
    # first tab (URL not shown in the original listing; python.org is a stand-in)
    driver.get('https://python.org')
    time.sleep(1)

    # new tab
    driver.execute_script('''window.open("http://github.com","_blank");''')

    # switch tabs
    time.sleep(1)
    driver.switch_to.window(driver.window_handles[0])
    time.sleep(1)
    driver.switch_to.window(driver.window_handles[1])

    time.sleep(5)
    driver.quit()
Figure: Firefox web browser with several tabs
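Indexing window_handles with 0 and 1 works when you know exactly how many tabs are open. A slightly more defensive sketch is to compare the handles before and after opening the tab and switch to the one that is new. The helper name open_tab is hypothetical; window_handles, execute_script() and switch_to.window() are the same calls used in the listing above. In a real browser a short wait may be needed before the new handle shows up.

```python
def open_tab(driver, url):
    """Open `url` in a new tab, switch to it and return its window handle.

    Hypothetical helper, not part of Selenium.
    """
    before = set(driver.window_handles)
    # Pass the URL as a script argument instead of pasting it into the JavaScript
    driver.execute_script("window.open(arguments[0], '_blank');", url)
    new_handles = [h for h in driver.window_handles if h not in before]
    if not new_handles:
        raise RuntimeError("no new tab was opened")
    driver.switch_to.window(new_handles[0])
    return new_handles[0]
```

For example, handle = open_tab(driver, "http://github.com") opens the tab and leaves the driver focused on it.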
Private mode
Web browsers like Mozilla Firefox and Google Chrome support “private browsing”. This means your
web browser does not save history, cookies and other data, so no traces are left on your local
computer.
However, private browsing is not a “privacy mode”. Websites can still track a computer by ip address
or use techniques like browser fingerprinting or server side cookies. That means that even if you
clear your cookies, websites can keep tracking you using other techniques.
Cookies are files stored on the computer that are often used for tracking and authorization.
Many users confuse the two, thinking private mode or incognito mode means privacy. It does not.
It means the browser doesn’t store the data on your computer, that’s all. By default, a selenium
browser starts with a clean, new profile.
Private mode in Firefox

In Mozilla Firefox, you can set the flag browser.privatebrowsing.autostart to True. This can be done
while starting the driver:

    from selenium import webdriver

    firefox_profile = webdriver.FirefoxProfile()
    firefox_profile.set_preference("browser.privatebrowsing.autostart", True)

    driver = webdriver.Firefox(firefox_profile=firefox_profile)

That’s the same as setting the private mode or opening a new private window.
Figure: Firefox web browser in private mode
Private mode in Chrome

Chrome calls it incognito mode instead. The flag --incognito starts the browser directly in incognito
mode.

    from selenium import webdriver

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--incognito")

    driver = webdriver.Chrome(options=chrome_options)
    driver.get('https://google.com')

Incognito mode and private mode are the same thing: the browser doesn’t store data like cookies and
history locally.
Figure: incognito window in Google Chrome
