101E IRT Module 2 Lesson 13 V12.0

101E IRT
Using the Internet as an Investigative Research Tool™
Module 2 | Lesson 13
Lesson 13: HTML & Meta Tags
A great deal of information can be gleaned by examining the structure and composition of a
website or webpage. Corroborative intelligence, such as keywords, image files, and meta
tags, can allow an investigator to determine important facts about the intended audience
and true purpose of a website or webpage. The ability to see "behind" a website or webpage
in this manner is a significant advantage to an online investigator. This lesson goes
"behind the scenes" to examine HTML and meta tags and to show how they can advance an
investigation in unexpected ways.
Upon completion of this lesson, students will:
• Understand the difference between markup tags and meta tags.
• Learn how to locate images and other files that may otherwise have gone untraced.
• Be able to identify hyperlinks to URLs and email addresses and locate hidden text
within a webpage.
This lesson should take no longer than 60 minutes to complete. If you have any questions
or require assistance, please contact the course instructor at training@toddington.com.
Chapters
1. Introduction
2. HTML Meta Tags & Markup Tags
3. Images
4. Hyperlinks
5. Invisible Content
6. Source Code ‘Head’ & Meta Tags
7. Video Tutorial
8. Knowledge Review
9. Notes for Investigators
10. Suggested Reading
PAGE 1 of 21 V12.0 DURATION: 60 MINUTES

Introduction
To better understand how a webpage is created and indexed by search engines, it is
necessary to understand some of the basic elements that make up a Web-based document
or site. The primary building block of a typical webpage is Hypertext Markup Language or
HTML. In simplest terms, HTML uses a set of markup symbols, often referred to as tags,
that tell a Web browser how a particular page or document should be displayed.
Markup tags not only tell the Web browser how to display the words and images on the
page, they also contain the information required for hyperlinks to work properly. It is not
only possible, but often highly useful, to directly view a webpage’s HTML source code, as
this can provide valuable insight into the real purpose of the site, as well as establishing its
intended audience. HTML source code can be viewed in raw form through most Web
browsers.
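The raw source that a browser displays can also be retrieved programmatically. The following is a minimal Python sketch rather than part of the original lesson; the function name fetch_page_source and the User-Agent string are illustrative assumptions.

```python
from urllib.request import Request, urlopen


def fetch_page_source(url: str, timeout: float = 10.0) -> str:
    """Return the raw HTML of a page as text, without rendering it."""
    # Assumption: a browser-like User-Agent is sent because some servers
    # reject requests that do not include one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (source-view sketch)"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a character set.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_page_source with the address of a page should return the same text that the browser shows in its source view, assuming the page is reachable and the server returns the same content to this request as it does to a browser.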
For this lesson, a sample webpage has been created and is located at
https://www.toddington.com/course-materials/example.htm. Go to this URL and confirm that
the page is the same as the one illustrated below.

The HTML source code may be located within different menus, depending on the browser
used to access it. The Mozilla Firefox browser is recommended throughout this
program as it is compatible with different operating systems, including Windows and
Macintosh.
Using Mozilla Firefox
In order to access the source code for the
https://www.toddington.com/course-materials/example.htm page, open the page in the browser
window and right click on the page; select ‘View Page Source’ from the options provided
(illustrated below).

Selecting this option will open the HTML source code window illustrated below.
Using Firefox, a page’s HTML source code can also be accessed via the dropdown Tools
menu. Within the dropdown menu, select ‘Web Developer’ followed by the ‘Page Source’
option.

Using Microsoft’s Edge Browser
In Windows 10, Internet Explorer was replaced by Microsoft’s Edge browser, which is just
as simple to use, with familiar keyboard shortcuts and other Windows-based features.
Before we can view a page’s source code, we will need to enable this option from within the
browser’s ‘Developer settings.’ To access these settings, type “about:flags” in the Edge
address bar. As illustrated below, ensure the ‘Show “View source” and “Inspect element”
in the context menu’ option is selected.
Similar to Firefox above, with the https://www.toddington.com/course-materials/example.htm
page open in an Edge browser window, right click on the page and select the
‘View source’ option (illustrated below).

As illustrated below, the F12 Developer Tools page will be displayed; by default, the
Debugger tab will be shown.

The ‘F12 Developer Tools’ can also be accessed from within the ‘More’ menu in the top
right-hand corner of the screen (illustrated below), or by pressing the Fn + F12 shortcut
keys on your keyboard.

Using Internet Explorer
If you are using an older version of Windows that is bundled with the Internet Explorer
browser, or simply prefer to use this browser over Edge, open the page
https://www.toddington.com/course-materials/example.htm, right click anywhere on the page,
and select ‘View source’ (illustrated in the screenshot below).
Similar to the Edge browser, selecting this option will open the F12 Developer Tools
window, with the Debugger tab displaying the HTML source code by default. A page’s
source code can also be viewed via the Internet Explorer dropdown Tools menu, by
selecting the F12 Developer Tools option; the keyboard shortcut Fn + F12 opens the same
window.

Using Chrome
Using the Chrome browser, a site’s HTML source code can be retrieved in the same manner
as the other browsers demonstrated in this lesson, by right clicking on the
https://www.toddington.com/course-materials/example.htm page and selecting ‘View Page
Source.’

In order to access a page’s source code via Chrome’s dropdown menu, hover your mouse
over the More tools option provided, and then select Developer tools (illustrated below).
Using Safari
Finally, using Safari, right clicking on the example page and selecting ‘Show Page Source’
will reveal the page’s source code.
By default, the ‘Show Page Source’ option is not enabled in Safari. In order to be able to
view this option, select ‘Safari’ from the toolbar, then select ‘Preferences.’

From the preferences window, select the ‘Advanced’ tab, and ensure the ‘Show Develop
menu in menu bar’ checkbox is ticked. You will now be able to view the ‘page source’
option in the right-click menu.

A page’s source code can also be viewed in Safari by selecting ‘Show Page Source’ from
within the Develop drop down menu, as illustrated below.

Note: The following demonstration will be conducted using the Mozilla Firefox browser.
The HTML source code itself should read the same regardless of the browser used to view it.
HTML Meta Tags & Markup Tags
All HTML source code documents consist of a ‘head’ and a ‘body,’ and are constructed using
‘markup tags’ and ‘meta tags.’
A markup tag is easily recognizable as it is encapsulated by triangular brackets and
generally formed in two parts, an opening tag and a closing tag, that surround the content
to which they apply. Text that appears on the webpage will usually be contained between two
markup tags. For example, the <p> markup tag represents a paragraph. A paragraph will open
with the <p> markup tag and close with the </p> markup tag. A one-sentence paragraph would
appear like this:
<p>This is an example one-sentence paragraph.</p>
When viewed through a browser, the markup tags would not be visible; only the text
contained between the tags would appear. Markup tags also contain instructions relating
to the font type, colour, size, and spacing of the text contained within the markup tags.
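To illustrate how markup tags carry display instructions while only the enclosed text is shown to the reader, the following minimal Python sketch separates the two. It is not taken from the example page: the class name VisibleTextParser is hypothetical, and the alignment, font, and size attributes in the sample line are assumed values.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect each markup tag with its attributes, and the text between tags."""

    def __init__(self):
        super().__init__()
        self.tags = []        # (tag name, attributes) in the order encountered
        self.text_parts = []  # text that a browser would display

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


# Assumed sample markup; the attribute values are illustrative only.
sample = (
    '<p align="center"><font face="Arial" size="4">'
    "<b>This is an example one-sentence paragraph.</b></font></p>"
)

parser = VisibleTextParser()
parser.feed(sample)
visible_text = " ".join(parser.text_parts)

print(visible_text)
for tag, attributes in parser.tags:
    print(tag, attributes)
```

Only the sentence is printed as visible text; the alignment, font face, size, and bold instructions are reported separately as tags and attributes.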
Carefully examine the HTML source code of the example page and note the following
instructions and markup tags in relation to the text displayed on the webpage.
• The instruction on how to align the sentence within the document.
• The specific font face for this text.
• The size and typeface of the text (e.g., bold).

The text of the page is identifiable within the body of the source code. In the example
above, the text reads:
“This is a picture taken of a rural English train station.”
Images
Images displayed on Web pages are stored separately from the HTML document and
recalled as the page is “built” within the browser. The example below shows the location of
the train station image, along with instructions to the browser as to the size of the image
and how it should be aligned (in this case, centred).
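Image references of this kind can also be listed automatically. The sketch below is a hedged illustration only: the class name ImageFinder is hypothetical, and the width, height, and align values in the sample tag are assumptions rather than the actual values used on the example page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageFinder(HTMLParser):
    """Record the full address and display instructions of each image."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if src:
            self.images.append({
                # Resolve a relative location against the address of the page.
                "url": urljoin(self.page_url, src),
                "width": attributes.get("width"),
                "height": attributes.get("height"),
                "align": attributes.get("align"),
            })


page_url = "https://www.toddington.com/course-materials/example.htm"
# Assumed sample tag; size and alignment values are illustrative.
sample = '<img src="/images/station.jpg" width="400" height="300" align="middle">'

finder = ImageFinder(page_url)
finder.feed(sample)
print(finder.images)
```

Because images are stored separately from the HTML document, resolving each src value against the page address gives the location from which the browser retrieves the file.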
The ability to identify an image within the HTML source code can be a significant advantage
in an investigation as more detailed information can be obtained about the domain being
investigated; in the case of this example, the domain is www.toddington.com.
By separating the above information into three distinct sections (detailed below), it may be
possible to locate additional information relating to this domain that may be difficult to
otherwise ascertain.
1. Images
This section signifies the name of a folder (directory) within the toddington.com domain.
With this information, it may now be possible to access the folder called “images” within
the toddington.com domain as we now know that this folder exists. This folder would
theoretically be located at https://www.toddington.com/images.
Depending on the way in which the site has been configured and administered, it may be
possible to access this folder and locate other data contained therein. Although the
discovery of this folder may be significant, public users may be prohibited from accessing
certain folders within particular domains due to password protection or other restrictions.
2. Station
This section is simply the name of the data, document, or image contained within the folder
named “images.” In this case, the image is named “station.”
3. JPG
This is the file extension and indicates what type of data or information “station” is. In this
case, “station” is a “JPEG” image.
To view this image in its original location, you would enter the following into your browser’s
address field:
www.toddington.com/images/station.jpg
Please note the above example images folder is not accessible on the Toddington International
corporate site.
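The three-part breakdown described above can be sketched in Python as follows. This is an illustration under assumptions: the function name split_image_url is hypothetical, and the https:// scheme has been added to the address so that the standard library can separate the domain from the path.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def split_image_url(url):
    """Break an image address into domain, folder, file name, and extension."""
    parts = urlsplit(url)
    path = PurePosixPath(parts.path)
    return {
        "domain": parts.netloc,
        "folder": str(path.parent),            # 1. the folder holding the file
        "name": path.stem,                     # 2. the name of the file
        "extension": path.suffix.lstrip("."),  # 3. the file type
    }


result = split_image_url("https://www.toddington.com/images/station.jpg")
print(result)
```

The folder value is the part an investigator might try to browse directly, subject to the access restrictions noted above.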
It is important to note that embedded within the image on the example page is a line of text
which cannot be located anywhere within the HTML source code (see text highlighted
below).

This is an effective way to include text within a webpage or site, and yet prevent it from
being readily located, crawled, and indexed by search engines and other automated
applications. The text is an integral part of the image and, therefore, is not contained
in the page’s HTML source code, but rather within the image file itself.
Hyperlinks
The paragraph shown below contains a line of text encapsulated within markup tags and an
email address contained within triangular brackets. Just as <p> indicates a new paragraph,
<a> indicates the use of a hyperlink.
Reading the above paragraph from left to right, the following can be ascertained:
<p> denotes a new paragraph containing the text: “If you’d like more information about
this picture email me!” The text is aligned in the centre and the size is coded as “big.”
Within the <a> brackets is “mailto:trains@toddington.com,” followed by “>email</a> me!”
The word “email” is enclosed between the opening <a> tag and the closing </a> tag,
indicating that this is the hyperlinked word on the webpage.
Examine the actual webpage to view the interpretation of this code by your Web browser.
The paragraph below is very similar in composition to the example shown above; however,
the markup tags contain a URL (http://www.railtrack.co.uk) rather than an email address.
Note again that the hyperlinked word, in this case the word “here,” is contained between
the opening <a> tag and the closing </a> tag.
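Both kinds of hyperlink described above can be pulled out of a page's source with a short script. The following is a minimal Python sketch; the class name LinkFinder is hypothetical, and the wording of the second sample sentence is assumed, as the lesson quotes only the linked word “here” and the URL.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect each hyperlink as (kind, target, linked text)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # target of the <a> tag currently open, if any
        self._text = []    # text seen since that <a> tag opened

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            href = self._href
            if href.lower().startswith("mailto:"):
                kind, target = "email", href[len("mailto:"):]
            else:
                kind, target = "url", href
            self.links.append((kind, target, "".join(self._text).strip()))
            self._href = None


# Sample markup modelled on the lesson; the second sentence is assumed wording.
sample_html = """
<p align="center"><big>If you'd like more information about this picture
<a href="mailto:trains@toddington.com">email</a> me!</big></p>
<p align="center">Click <a href="http://www.railtrack.co.uk">here</a> for more.</p>
"""

finder = LinkFinder()
finder.feed(sample_html)
for link in finder.links:
    print(link)
```

Each result pairs the hidden target (an email address or URL) with the word that appears as the hyperlink on the rendered page.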
Invisible Content
The final paragraph in the body of the HTML source code is completely invisible on the
example Web page. The text reads:
“The text you are now reading is invisible when viewed through a browser as the text is the
same colour as the background… Not a bad way to hide text that can be easily viewed by
someone who knows where and how to find this hidden message.”

This has been achieved by presenting the text in the same colour as the background of the
webpage, rendering it effectively invisible to anyone unaware of its existence. This is an
excellent method of “hiding” information in plain view on the Web.
You can highlight the example “invisible” text by holding down the left button of your mouse
and dragging the cursor over the page. The hidden text will be revealed, as highlighted in a
pale blue colour below.
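Text hidden in this way can also be flagged by comparing colours in the source code. The sketch below assumes that the page sets its background with a bgcolor attribute on the <body> tag and colours text with <font color="..."> tags; the lesson does not show the exact markup used, so both are assumptions, as is the class name HiddenTextFinder. Pages that set colours through style sheets, or that name the same colour in different ways, would need a more thorough approach.

```python
from html.parser import HTMLParser


class HiddenTextFinder(HTMLParser):
    """Flag text whose font colour matches the page background colour."""

    def __init__(self):
        super().__init__()
        self.background = None
        self.color_stack = []  # colours of the <font> tags currently open
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "body":
            self.background = (attributes.get("bgcolor") or "").lower() or None
        elif tag == "font":
            self.color_stack.append((attributes.get("color") or "").lower())

    def handle_endtag(self, tag):
        if tag == "font" and self.color_stack:
            self.color_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if (text and self.background and self.color_stack
                and self.color_stack[-1] == self.background):
            self.hidden.append(text)


# Assumed sample markup; colours and wording are illustrative only.
sample_html = (
    '<body bgcolor="#FFFFFF"><p>Visible text.</p>'
    '<p><font color="#ffffff">Hidden message.</font></p></body>'
)

finder = HiddenTextFinder()
finder.feed(sample_html)
print(finder.hidden)
```

The comparison is made on lower-cased colour values, so "#FFFFFF" and "#ffffff" are treated as the same colour.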
Source Code ‘Head’ & Meta Tags
At the top of the HTML source code document, the meta tags and related information are
contained within the <head> and </head> tags (illustrated below).
The ‘head’ can be divided into a manageable list of meta tags, each of which contains a
description of its content and purpose. Not all HTML source code is compiled in this simple
format, and students will need to spend time examining the HTML source code of a variety
of websites to gain a better understanding of the more complex types of HTML.
meta NAME=“Description”
This meta tag gives a natural-language overview of the webpage to which it relates (in this
case https://www.toddington.com/course-materials/example.htm) and enables those
search engines that support description tags to display this text as the description in
their search results.
meta NAME=“KeyWords”
This meta tag shows the keywords that have been chosen to describe this site. Search
engines have historically used these keywords to assist in determining the relevance of a
page to queries containing them, although several major search engines, including Google,
now give the tag little or no weight. The keywords in this example are: trains, train,
railways, railway, stations, rural, United Kingdom, England, and Britain.
Keywords that accurately describe a page but may not be included in the actual text of the
document are often listed, although there are many instances on the Web of unscrupulous
authors filling a meta tag with irrelevant keywords in an attempt to attract more traffic.
Notice how some of the keywords are listed in both root and plural form.
meta NAME=“ROBOTS”
The final meta tag “ROBOTS” relates to the robots exclusion standard and will be covered
later within this module of the course.
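The three meta tags described above can be read out of a page's ‘head’ with a short script. In the following minimal Python sketch, the class name MetaTagReader is hypothetical; the keyword list is taken from this lesson, while the description text and the ROBOTS value are assumed for illustration, as the lesson does not quote them.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect the content of every named meta tag."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                # Names are stored in lower case: "KeyWords" becomes "keywords".
                self.meta[name.lower()] = attributes.get("content") or ""


# Description and ROBOTS values below are assumptions, not the page's own.
sample_head = """
<head>
<meta NAME="Description" CONTENT="A picture of a rural English train station.">
<meta NAME="KeyWords" CONTENT="trains, train, railways, railway, stations, rural, United Kingdom, England, Britain">
<meta NAME="ROBOTS" CONTENT="noindex, nofollow">
</head>
"""

reader = MetaTagReader()
reader.feed(sample_head)
keywords = [word.strip() for word in reader.meta["keywords"].split(",")]
print(reader.meta["description"])
print(keywords)
```

Splitting the keyword list in this way makes it easier to compare the chosen keywords with the text that actually appears on the page.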

The example page examined is a very simple HTML document. Many pages on the Web
incorporate numerous advanced features, including frames (multiple documents being
displayed as one page), embedded multimedia (e.g., Shockwave animation, audio), and
other applications (e.g., Java applets).
Take the time now to examine the HTML source code of a number of pages you
frequently visit and note the differences.
Video Tutorial
This section contains a video tutorial by our training team that demonstrates how to locate
and view a website’s source code. We recommend students take a few minutes to view the
following video before proceeding to the next section: https://youtu.be/Rf-7bu9jGlY.
Knowledge Review
Knowledge Reviews are designed to assist with information retention and do not form part
of the overall grade for this or any other module. There may be more than one correct
answer for some of the questions, and often, there will be no “correct” or “incorrect”
answer.
Students should now complete the Knowledge Review relating to this lesson — HTML & Meta
Tags. This review is located within Lesson 13 of Module 2 on the training site homepage, or
here: Knowledge Review: HTML & Meta Tags.
Please contact the course instructor at training@toddington.com with any questions
relating to this Knowledge Review or lesson.
Notes for Investigators
Often, when conducting a search, the search engine may indicate that the keywords
you searched for are included in the results, but when you view a specific result, the
keywords do not appear in the visible page content. In such instances, you will need to
review the ‘page source’ content. For example, this method can be used to locate the unique
identification number of an account, the actual URL of an image embedded in a page, or a
photo credit for a photo included in an article; these items will not be apparent in the
normal page view.
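A quick way to confirm that a search term is present only in the source code is to compare the raw HTML with the text a browser would display. The following is a minimal Python sketch under stated assumptions: the names VisibleText and keyword_only_in_source are hypothetical, the sample page is invented for illustration, and text inside script, style, and title elements is treated as not visible in the page body.

```python
from html.parser import HTMLParser

NOT_DISPLAYED = ("script", "style", "title")


class VisibleText(HTMLParser):
    """Collect only the text that appears in the rendered page body."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # greater than zero while inside a non-displayed element

    def handle_starttag(self, tag, attrs):
        if tag in NOT_DISPLAYED:
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in NOT_DISPLAYED and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def keyword_only_in_source(html, keyword):
    """True when the keyword is in the source code but not in the visible text."""
    parser = VisibleText()
    parser.feed(html)
    visible = " ".join(parser.parts).lower()
    needle = keyword.lower()
    return needle in html.lower() and needle not in visible


# Invented sample page for illustration only.
sample_html = (
    '<html><head><meta name="keywords" content="replica handbags"></head>'
    "<body><p>Welcome to our shop.</p></body></html>"
)

print(keyword_only_in_source(sample_html, "replica"))
print(keyword_only_in_source(sample_html, "welcome"))
```

A result of True suggests the term should be looked for in meta tags, attributes, or other parts of the source that are not displayed.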
Furthermore, websites dealing with illegal content, such as the sale of counterfeit
identification documents (fake IDs) or products (e.g., replica handbags, shoes), may use the
keyword section in the source code to direct traffic to their site, as opposed to overtly
disclosing the purpose of their website on the actual homepage.
Suggested Reading
• https://www.w3schools.com/html
• https://html.com
• https://www.computerhope.com/issues/ch000746.htm
©Copyright 2021 - Toddington International Inc. All Rights Reserved.

Duplication of the materials within this publication without express permission is prohibited.
www.TODDINGTON.com
