Computing Fundamentals
||||||||||||||||||||
||||||||||||||||||||
In this book we take an in-depth look at how computers function. These foundational concepts are often the first things taught in a Computer
Science degree, and for good reason. It is possible to enter the security industry without understanding them, but students who do so
often lack the knowledge that more advanced concepts build on. The gap tends to show on more advanced courses, where attacks start to take
advantage of low-level computing concepts, and it can cause students to struggle.
||||||||||||||||||||
||||||||||||||||||||
Components of a Computer
Motherboard
Storage
What Is a Kernel?
What Is a Process?
What Is an Interrupt?
Boot Loaders
Bits
Counting
Counting in Binary
Counting in Hexadecimal
Negative Numbers
Two’s Complement
Notation
Virtualisation
Virtualisation in Security
Virtualisation Applications
Types of Networks
Network Hardware
OSI Model
IP Addresses
IP Addresses: IPv4
IP Addresses: IPv6
IP Addresses: Netmask
IP Addresses: CIDR
IP Addresses: Broadcast
MAC Addresses
ARP
DNS
DHCP
Packets
Protocols
Ports
ICMP
Encoding
Encryption
||||||||||||||||||||
||||||||||||||||||||
Hardware Components
Components of a Computer
Of course there are additional components such as the power supply - but the key ones are discussed here.
Motherboard
The motherboard connects the components of the computer together. It also holds the firmware that starts the computer: software programmed onto a
Read-Only Memory (ROM) chip or, on modern boards, rewritable flash memory. The two main types of firmware found on motherboards are BIOS (Basic
Input Output System), used on older motherboards, and UEFI (Unified Extensible Firmware Interface), used on newer ones.
The firmware drives the boot process, also known as ‘bootstrapping’: a small program is loaded into memory, which in turn loads other programs
(including the Operating System). The boot process usually also involves a series of tests to make sure the hardware is functioning correctly.
||||||||||||||||||||
||||||||||||||||||||
The CPU (Central Processing Unit) is the ‘brain’ of the computer. All tasks performed by the computer start at the CPU, which then sends signals to other
components such as the hard disk.
Each processor can perform only a single task at a time, although it often doesn’t feel like it. You might be listening to music on your computer while
writing an e-mail. The two tasks appear to run simultaneously, but this is an illusion. A processor runs through billions of clock cycles per second, and by
switching back and forth between tasks very quickly it gives users the impression that it is doing several things at once. Computers work so much faster
than humans that we usually cannot tell the difference.
Some CPUs have multiple ‘cores’. Each core is in effect a small processor, so such CPUs really can perform multiple tasks at the same time by assigning
each task to its own core. Note the difference between the terms CPU and processor as used here: a CPU is the unit that is fitted to a computer, while
there might be several processors, or processing cores, on a single CPU.
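The switching described above can be sketched as a toy round-robin scheduler. This is only an illustration of the idea, not how any real operating system schedules work; the task names and the `run_round_robin` helper are made up for the example.

```python
from collections import deque

def task(name, steps):
    """A toy task: each `yield` marks a point where the processor can switch away."""
    for i in range(1, steps + 1):
        yield f"{name} step {i}"

def run_round_robin(tasks):
    """Run one step of each task in turn until every task has finished."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        try:
            log.append(next(current))   # run a single time slice
            queue.append(current)       # not finished: go to the back of the queue
        except StopIteration:
            pass                        # finished: drop the task
    return log

log = run_round_robin([task("music", 2), task("email", 3)])
for entry in log:
    print(entry)
```

The two tasks end up interleaved one step at a time, which is the illusion of simultaneity on a single core.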
||||||||||||||||||||
||||||||||||||||||||
Random Access Memory (RAM) is a fast form of data storage. Accessing it is orders of magnitude faster than accessing hard disk storage.
RAM is volatile memory: the data stored in it rapidly degrades once power is no longer supplied to the chips. This means that the contents of
RAM do not usually survive a reboot.
Because RAM is much faster than storage, programs are read from storage and loaded into RAM while they are running. This means the CPU can access
the instructions contained in the program more quickly. We’ll look at how memory is laid out later in this book.
Storage
Storage is a non-volatile data store: the data it contains does not degrade when no power is supplied. However, it is much slower to access than RAM,
which is why data is typically read from storage and loaded into RAM. Storage is most often a hard drive or SSD, but it can also be
optical media such as a Blu-ray disc or even a DVD (if anyone still remembers those!).
A traditional hard drive is mechanical. It contains several platters, circular disks on which data is stored. An actuator arm moves the read/write head
across the platters as they spin. Data is encoded on the platter magnetically: put simply, the magnetic polarity of each tiny region of the surface
represents a binary 1 or a binary 0.
An SSD (Solid State Drive), on the other hand, has no moving parts and is therefore considered somewhat more reliable than a traditional mechanical
hard drive. SSDs are also faster at reading data. They have their own challenges, however: they are rated for a limited number of write operations and
degrade beyond that point. Originally that number was feared to be quite low, but later studies have shown it to be considerably higher than expected.
||||||||||||||||||||
||||||||||||||||||||
The graphics card is responsible for processing graphics. Graphics hardware can be integrated into the CPU or motherboard, but it is also common to have
the graphics card as a discrete component. GPUs (Graphics Processing Units) are highly specialised processors that excel at number-crunching tasks that
can be split into many parallel pieces. This makes them well suited to fast hashing and similar cryptographic work as well as to graphics, which is why you
often see them used for tasks such as bitcoin mining, password cracking and gaming.
Software
||||||||||||||||||||
||||||||||||||||||||
The operating system is software that manages hardware and software resources. Common operating systems include Windows, Linux and the Mac
Operating System, each of which is covered in the Operating Systems section.
The operating system acts as a bridge between software and hardware. Most applications require an operating system to run.
What Is a Kernel?
The kernel is the core of an operating system and the first part of it to load on start up. It is responsible for low-level work such as managing memory,
scheduling processes and communicating with hardware.
The kernel is loaded into a protected area of memory known as kernel space. A user’s applications, on the other hand, run in user space, which
is a separate area of memory altogether.
What Is a Process?
When a computer program is executed, its instructions are read from storage and loaded into memory, ready for the CPU to start executing them.
The instance of a computer program that is running in memory is known as a ‘process’. A process contains the program’s code together with its
current state and activity.
What Is an Interrupt?
An interrupt is a signal that can be sent to the processor by hardware or software components. It ‘interrupts’ the normal flow of program execution so
that an important event can be handled by the processor. The processor saves the state of its current task and calls an interrupt handler to deal with the
event.
Once the event that caused the interrupt has been dealt with, the processor retrieves the saved state of what it was doing before and continues.
A software interrupt is raised by a running program, for example when it asks the operating system to perform a service on its behalf. (Pressing
Ctrl + Alt + Delete on Windows is sometimes described as a software interrupt, but the key presses themselves reach the processor as hardware
interrupts from the keyboard.)
Typing or moving the mouse generates a hardware interrupt. Imagine if it did not. You wouldn’t see the mouse cursor move until the processor had
finished whatever it was doing. The cursor position would update suddenly, and you wouldn’t see the cursor move again until the processor next had
time to update its position on your screen.
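The save, handle, restore cycle can be sketched with a toy model. Everything here is a simplification made up for illustration: the "processor state" is just a position in a list of steps, and `run_with_interrupts` is a hypothetical helper, not a real interrupt mechanism.

```python
def run_with_interrupts(task_steps, interrupts):
    """Run a task step by step, servicing any interrupt due before each step.

    `interrupts` maps a step index to the name of an event that arrives
    just before that step would run (a made-up model for illustration).
    """
    pending = dict(interrupts)                # copy so the caller's dict is untouched
    log = []
    position = 0                              # stands in for the processor state
    while position < len(task_steps):
        event = pending.pop(position, None)
        if event is not None:
            saved = position                  # save the state of the current task
            log.append(f"handler: {event}")   # call the interrupt handler
            position = saved                  # restore the saved state and carry on
        log.append(task_steps[position])
        position += 1
    return log

log = run_with_interrupts(["load", "compute", "store"], {1: "key press"})
for entry in log:
    print(entry)
```

The handler runs between two steps of the task, and the task then continues from exactly where it left off.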
Boot Loaders
A boot loader is a small computer program which starts when a computer is booted up. Once the firmware has run its hardware diagnostics and worked
out which disk to look at for an Operating System, the boot loader loads the Operating System into memory, ready for the processor to start executing
its instructions.
This process is also known as ‘bootstrapping’, a reference to ‘pulling yourself up by your own bootstraps’. The idea is that a smaller program is loaded in
order to load a larger program.
||||||||||||||||||||
||||||||||||||||||||
The firmware controls things such as which disk to look for the OS on first (boot order), RAM timings, processor clock speed and so on. All of
these settings can be tweaked (at your own peril!) in the BIOS settings. To get to the BIOS settings, follow the instructions shown when the computer
first starts. It may say something like ‘Press DEL to enter SETUP’. The exact key that opens the BIOS settings screen depends on the BIOS itself and can
vary from computer to computer.
Numbering Systems
Bits
A bit is the smallest unit of data that a computer can represent. Bits are the ‘building blocks’ from which everything else is made.
A bit is a Boolean value: it can have only two possible values, True or False. These can also be written as 1 or 0.
A group of eight bits is called a ‘byte’, and a group of four bits is called a ‘nibble’. (Half a byte is a nibble. Get it?)
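As a small illustration of bytes and nibbles, the sketch below splits one byte into its two nibbles. The `0b` prefix is Python's way of writing a binary number, and `split_into_nibbles` is a name invented for this example.

```python
def split_into_nibbles(byte):
    """Return the high and low nibble (four bits each) of an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a byte holds values from 0 to 255")
    high = byte >> 4        # shift the top four bits down
    low = byte & 0b1111     # mask off everything but the bottom four bits
    return high, low

value = 0b10100110          # one byte: eight bits
high, low = split_into_nibbles(value)
print(format(value, "08b"), "->", format(high, "04b"), format(low, "04b"))
```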
Counting (1)
• Before we can look into bits in any more detail, we need to re-learn how to count
• Counting is easy because we learn it from such a young age. We can do it without giving it any thought at all, but it is important to understand how it works
• Humans count in Base 10 (Decimal). That is, we start from 0 and count up to 9. When we get to 9 we’ve run out of digits that we haven’t used, so we put a 1 to the left of the number and roll the 9 back around to a 0.
Before we can go any further in this course we have to re-learn how to count. We’ve all been counting since a very young age, so it’s almost as natural as
breathing, and this is something of a problem: we no longer really think about the process of counting.
Humans count in Base 10, also known as Decimal or Denary. This means we start from the number 0 and count up to 9, and then we stop because
we’ve run out of digits to use (10 isn’t a digit, it’s two digits). When that happens, the 9 becomes a 0 and we put a 1 to the left of it.
We know this as 10.
Counting (2)
Unlike with Binary and Hexadecimal, which you’ll see in a moment, it’s easier for humans to read these numbers from left to right.
Counting in Binary
Computers don’t count in Denary; they count in Binary. Binary is Base 2, but it works on the same principle as Base 10.
In Binary there are only two possible digits: 0 and 1. Once a column reaches 1 it rolls back around to 0, and a 1 is carried to the left. So let’s count to 15
in binary:
0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
Did you ever hear the old joke, ‘There are 10 types of people in this world: those who understand binary and those who don’t’? Well, now you should
understand it.
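The roll-over-and-carry rule is the same in every base, which a short sketch can show. The helper names `increment` and `count` are invented for this example, and the digit formatting only works for bases up to 10.

```python
def increment(digits, base):
    """Add one to a number held as a list of digits, most significant first."""
    digits = digits[:]                 # work on a copy
    i = len(digits) - 1
    while i >= 0:
        if digits[i] < base - 1:
            digits[i] += 1             # room left in this column: done
            return digits
        digits[i] = 0                  # column is full: roll over and carry left
        i -= 1
    return [1] + digits                # carried past the leftmost column

def count(limit, base):
    """Return the numbers 0..limit written in the given base (bases 2 to 10)."""
    digits, out = [0], []
    for _ in range(limit + 1):
        out.append("".join(str(d) for d in digits))
        digits = increment(digits, base)
    return out

print(", ".join(count(15, 2)))
print(", ".join(count(12, 10)))
```

Calling `count` with base 2 reproduces the sequence above; with base 10 it produces ordinary counting, using exactly the same code.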
||||||||||||||||||||
||||||||||||||||||||
• It’s easiest to read binary from right to left, as the smallest value is on the right
• Let’s look at it this way:
So then we add up the values of the columns that contain a 1 to get the decimal value.
In this case:
2 + 8 + 32 = 42
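The table that accompanied this slide is not reproduced here, but the sum 2 + 8 + 32 corresponds to the binary number 101010, or 00101010 if we assume it was written as eight bits. A minimal sketch of the right-to-left method, with `binary_to_decimal` as an illustrative name:

```python
def binary_to_decimal(bits):
    """Sum the place values of every column that holds a 1, reading right to left."""
    total = 0
    place_value = 1                      # the rightmost column is worth 1
    for bit in reversed(bits):
        if bit == "1":
            total += place_value
        elif bit != "0":
            raise ValueError(f"not a binary digit: {bit!r}")
        place_value *= 2                 # each column is worth double the one to its right
    return total

print(binary_to_decimal("00101010"))     # 32 + 8 + 2
```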
Counting in Hexadecimal
Binary is inconvenient. Even for people who understand it, the numbers quickly become too long to write or remember easily. To get around
that we use Base 16, or Hexadecimal, as a kind of shorthand.
Hexadecimal has a problem, however. Since humans count in Base 10, we don’t have digits that can represent values beyond 9.
||||||||||||||||||||
||||||||||||||||||||
To get around that, we use letters to represent digits greater than 9. We go from 0 to 9 as usual, then use A to F to represent the values 10 to 15.
Of course, just because we use the same digits does not mean the value is the same as in Denary. Take ‘10’, for example: if it is written in hexadecimal,
its value in denary is 16. This shows how important it is to know which number system is being used when you
are counting.
||||||||||||||||||||
||||||||||||||||||||
This is very similar to how we read Binary numbers. The only difference is that in Binary each column’s value was either present or absent, whereas here
each column’s value can be present several times over. For example, there is one 16 here but there are four 4096s.
We have fifteen 1s.
We have one 16.
We have ten 256s.
We have four 4096s.
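The figure for this example is not reproduced here, but the counts listed (four 4096s, ten 256s, one 16 and fifteen 1s) correspond to the hexadecimal digits 4, A, 1 and F, so the sketch below assumes the number shown was 4A1F. `hex_to_decimal` is an illustrative name.

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_decimal(text):
    """Multiply each hexadecimal digit by its place value (1, 16, 256, 4096, ...)."""
    total = 0
    place_value = 1
    for char in reversed(text.upper()):
        digit = HEX_DIGITS.index(char)   # 'A' -> 10 ... 'F' -> 15; ValueError if not a hex digit
        total += digit * place_value
        place_value *= 16
    return total

# 4 * 4096 + 10 * 256 + 1 * 16 + 15 * 1
print(hex_to_decimal("4A1F"))
print(hex_to_decimal("10"))
```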
Negative Numbers
So we’ve seen how to represent positive numbers in Binary, but where do negative numbers come in?
A whole number is called an ‘integer’. To represent a negative number we need to use a ‘sign bit’. An integer that contains a sign bit is called a
‘signed integer’, and it can represent either positive or negative whole-number values.
Of course, using a sign costs us one bit, which becomes the sign bit. This is usually the MSB or Most Significant Bit (the one with the highest value). For a
32-bit integer, that means we go from being able to represent values from 0 to 4,294,967,295 to values from −2,147,483,648 to 2,147,483,647.
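The 32-bit figures above follow directly from the number of bits available. A short sketch, assuming the signed values use two's complement (the function names are illustrative):

```python
def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer with this many bits."""
    return 0, 2 ** bits - 1

def signed_range(bits):
    """Smallest and largest value of a signed (two's complement) integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

print(unsigned_range(32))
print(signed_range(32))
print(signed_range(8))
```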
||||||||||||||||||||
||||||||||||||||||||
To understand how this works, we need to look at how negative numbers are calculated. The leftmost bit, or the MSB (Most Significant Bit), is the sign bit.
In this example the bit is a 1, which indicates that the number is negative.
||||||||||||||||||||
||||||||||||||||||||
We go to the first ‘1’ starting from the right. Then, leaving the sign bit aside, we invert (make a 1 into a 0 and a 0 into a 1) every digit to the left of that first 1.
0000001 in Denary is 1.
We know from the sign bit that the number is negative, so its value is -1.
Two’s Complement
Two’s complement is a way of turning a positive number into a negative number or a negative number into a positive number (negating it).
Take the 8-bit binary number 00000110. As a signed integer, this has the denary value 6. If we want to represent -6 we can use two’s complement. First,
from the right go to the first 1; then invert all the digits to the left of that 1. We end up with 11111010. The first digit on the left is the sign bit,
indicating that the number is now negative. Re-apply two’s complement, ignoring the leftmost sign bit, and you get 0000110, which is 6. So we can see how
we went from 6 to -6 and then back.
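The shortcut described above gives the same result as the more commonly quoted rule of inverting every bit and adding one. The sketch below implements both on fixed-width bit strings so they can be compared. The function names are invented for this example, and the shortcut is applied across the full width, sign bit included, which yields the same bit pattern.

```python
def negate_shortcut(bits):
    """Keep everything from the rightmost 1 onwards; invert every digit to its left."""
    first_one = bits.rfind("1")
    if first_one == -1:
        return bits                       # zero is its own negative
    flipped = "".join("1" if b == "0" else "0" for b in bits[:first_one])
    return flipped + bits[first_one:]

def negate_invert_add_one(bits):
    """Invert every bit, then add one, staying within the same width."""
    width = len(bits)
    inverted = int(bits, 2) ^ (2 ** width - 1)
    return format((inverted + 1) % 2 ** width, f"0{width}b")

def to_signed(bits):
    """Read a bit string as a signed two's complement integer."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value

minus_six = negate_shortcut("00000110")
print(minus_six, to_signed(minus_six))
print(negate_shortcut(minus_six), to_signed(negate_shortcut(minus_six)))
```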
Notation
We’ve covered a lot to do with numbers. Perhaps the most confusing part is this: how do we know which numbering system a number is written in?
We already know that 10 in Denary, 10 in Binary and 10 in Hexadecimal all have totally different values.
We use notation to tell the difference. When you are working with multiple numbering systems, the standard practice is to mark each number with the
system it is written in.
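The notation table from this slide is not reproduced here. As one common convention (an assumption about what the table showed, not a reconstruction of it), many programming languages mark the base with a prefix. The sketch below uses Python's own prefixes; other languages and textbooks may use different markers, such as a subscript base.

```python
# The same two digits, "10", read in three numbering systems.
as_decimal = 10        # no prefix: decimal (denary)
as_binary = 0b10       # 0b prefix: binary
as_hex = 0x10          # 0x prefix: hexadecimal

print(as_decimal, as_binary, as_hex)          # all printed in decimal

# int() takes the base explicitly when the number arrives as text.
print(int("10", 10), int("10", 2), int("10", 16))

# bin() and hex() go the other way and add the prefix for you.
print(bin(42), hex(42))
```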
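Exercises
Binary to Decimal
1. 11011101: __________________________
2. 11100101: __________________________
3. 10010100: __________________________
4. 11100011: __________________________
Decimal to Binary
Convert the following decimal numbers into binary:
1. 59: _________________________________
2. 192: _________________________________
3. 681: _________________________________
4. 1024: _________________________________
Two’s Complement
1. -42: __________________________________
2. -91: __________________________________
3. -22: __________________________________
4. -3: __________________________________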
Exercise Answers
Binary to Decimal
1. 11011101: 221
2. 11100101: 229
3. 10010100: 148
4. 11100011: 227
Decimal to Binary
1. 59: 111011
2. 192: 11000000
3. 681: 001010101001
4. 1024: 010000000000
Two’s Complement
1. -42: 11010110
2. -91: 10100101
3. -22: 11101010
4. -3: 11111101
Operating Systems
||||||||||||||||||||
||||||||||||||||||||
Microsoft has a very large share of the desktop operating system market. Versions of Windows that have reached the end of support (at the time this
material was written, those prior to Windows Vista) no longer receive patches and updates, which means that they can’t be considered secure.
Likewise, Windows Server 2003 and earlier no longer receive patches and updates and should no longer be used.
Server Core is a more recent offering from Microsoft: a stripped-down version of Windows Server. With Server Core, you can only access the
command prompt and a very limited set of GUI features (such as the registry editor).
Windows Server Datacentre edition is not available without a support contract from Microsoft.
||||||||||||||||||||
||||||||||||||||||||
Linux comes in many different distributions. They all have one thing in common, however: the Linux kernel. Many distributions are tailored
for either server use or desktop use. Desktop versions tend to come with a GUI (Graphical User Interface), whereas server distributions tend to have
everything unnecessary stripped out, including the GUI. Linux is most commonly used as a server OS, but improvements in usability have started to shift
this perception, with distributions such as Ubuntu aiming to make Linux easier to use for non-technical users.
||||||||||||||||||||
||||||||||||||||||||
The Mac Operating System is mainly seen in desktop form; there is a server version, but it is not commonly seen. The Mac platform as a whole
is based on Unix, which is similar in many ways to Linux. This means that the Mac Operating System has capabilities similar to those of
many Linux distributions, including a Terminal and many terminal commands that will seem familiar to Linux users.
Virtualization
Virtualization is the process of using software to mimic hardware components. These components are ‘virtual’, but they behave like real computer
hardware. We can install an Operating System on a Virtual Machine built from them as if it were a physical computer.
One benefit of Virtual Machines is that they are isolated from the host running the virtualisation software. This is particularly
useful for high-risk activities such as malware analysis (oops, I double clicked it by mistake!). Make sure you disable the network card in the virtual
machine if you’re going to do that, though, and bear in mind that the isolation is strong rather than absolute.
It’s also incredibly useful to be able to run multiple operating systems on one computer without having to restart to switch between them. Many
security tools run on only one particular operating system, so you’ll be using virtualisation a lot in the security industry.
||||||||||||||||||||
||||||||||||||||||||
• Host OS
• Guest OS
OS-Level Virtual Machines are the most common type of Virtual Machine. They allow multiple Operating Systems to run virtually on the same hardware.
With OS-Level Virtual Machines, the Operating System that runs the software that creates the Virtual Machines is known as the ‘Host OS’. The Operating
Systems that run on the Virtual Machines are known as Guest OSes.
Application-Level Virtual Machines allow an application to run in its own separate Virtual Machine. This means that if the application is compromised, it is
much harder for an attacker to take over the whole computer.
Virtualization in Security
Virtualization is extremely important for security people. We often need both Windows and Linux at the same time. Dual-booting (installing two
operating systems on one computer and choosing between them at start-up) is an option, but you would have to reboot to switch back and forth. With
virtualization we can run both Windows and Linux at the same time, without the inconvenience of rebooting every time we need to swap between them.
Virtual machines have the additional benefit of being isolated from the host. This is particularly good for malware analysis (but do yourself a favour and
disable networking in the virtual machine so the malware doesn’t spread over the network). Additionally, many virtualisation applications support
snapshotting, which saves the state of a virtual machine at a particular point in time (such as when you have a known good configuration). You can later
revert to the snapshot and have a working Operating System without having to re-install it.
Virtualization Applications
• VMware
• VMware Workstation (Windows / Linux) (Paid)
• VMware Player (Windows / Linux) (Free)
• VMware Fusion (Mac) (Paid)
• VirtualBox (Windows / Linux / Mac) (Free)
• In this course we will be using VMware. You can use any of the versions listed above.
There are many applications out there that support virtualisation. We’re going to be using VMware in this course, but you should be able to use VirtualBox
instead if you prefer, so long as you translate the steps to your application.
||||||||||||||||||||
||||||||||||||||||||
When you first open VMware you will be greeted by the home screen where you can choose to create a new virtual machine or open an existing one.
||||||||||||||||||||
||||||||||||||||||||
Existing Virtual Machines can be found on the left under ‘Home’. If you don’t have any yet, the list will be blank. You can create a new virtual machine by
clicking on ‘Create a New Virtual Machine’ or from the Player dropdown menu in the top left.
||||||||||||||||||||
||||||||||||||||||||
From the Virtual Machine creation wizard you can specify the installation media used to install the Operating System when the Virtual Machine
is created. You can use a DVD or Blu-ray disc in your physical drive if you have one, or you can specify an image file, such as an ISO, that you have
downloaded to your computer. You can also choose to install the OS later. If you go that route, the Virtual Machine will have only a BIOS on it. You will
then need to install the OS by attaching a virtual disc drive to the machine and either mounting an ISO image in it or connecting it to your physical
disc drive.
||||||||||||||||||||
||||||||||||||||||||
• After selecting the installation media, you may have to tell VMware what OS you are installing if it cannot detect it automatically
• If it does get detected, you may be given the option of ‘easy install’, which is where you tell VMware what settings you want to use, such as product key, username, password, etc.
• Then VMware will install the OS with those settings without you having to do anything
Once you have specified the installation media, VMware needs to know which Operating System you are planning to install. Sometimes it can detect
this automatically, but if automatic detection fails you will need to pick the right OS from the dropdown. VMware has a convenient feature
called ‘easy install’, which is available for some Operating Systems (such as the Ubuntu Linux distribution). With easy install, VMware asks you up front
for all the information the OS installer will ask for, and then fills it in automatically when it starts the installation. This is very convenient if you want
to leave it to install and come back later. If your OS does not support easy install, you will have to follow the installation steps just as if you were
installing that OS on a real computer.
||||||||||||||||||||
||||||||||||||||||||
You will have to specify where you want to save the files for the Virtual Machine.
||||||||||||||||||||
||||||||||||||||||||
You’ll have to specify how much hard disk space you want to assign, and whether you want to split the virtual disk into multiple files. Splitting is useful if
you want to be able to copy the Virtual Machine onto other disks and carry it around with you: it’s easier to copy several smaller files than one very large
file. Splitting does make the virtual disk somewhat slower than keeping it in one large file, but in most cases the convenience outweighs the minor speed
decrease.
||||||||||||||||||||
||||||||||||||||||||
VMware will configure the rest of the hardware for you with what it believes to be reasonable defaults for the Operating System you are installing. For
example, Windows usually gets at least 2GB of RAM, while most Linux distributions are given 512MB. You can customise these defaults, and we
recommend increasing the RAM if you have enough to spare on your computer.
||||||||||||||||||||
||||||||||||||||||||
When customising your VM hardware, make sure your computer is powerful enough to handle the resources you assign to a virtual machine.
Don’t assign 8GB of RAM to your VM if your host only has 8GB of RAM in the first place: your host needs some RAM to function too. Similarly, if you
have 4GB of RAM to spare on your host, don’t run two VMs at once that each have 4GB of RAM. Windows isn’t happy with less than 2GB of
RAM either, so you need to account for that.
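The RAM arithmetic above can be written down as a quick sanity check. This is a rough sketch only: `ram_budget_ok` is a made-up helper, and the 2GB default reserve for the host is an assumption you should adjust for your own machine.

```python
def ram_budget_ok(host_ram_gb, vm_ram_gb, host_reserve_gb=2):
    """Return True if the VMs' combined RAM still leaves the host its reserve.

    `host_reserve_gb` is an assumed figure for what the host OS itself needs.
    """
    return sum(vm_ram_gb) <= host_ram_gb - host_reserve_gb

print(ram_budget_ok(8, [8]))        # one 8GB VM on an 8GB host
print(ram_budget_ok(8, [4, 4]))     # two 4GB VMs on an 8GB host
print(ram_budget_ok(16, [4, 4]))    # two 4GB VMs on a 16GB host
```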
||||||||||||||||||||
||||||||||||||||||||
You can also change the CD settings, although you shouldn’t if your OS has yet to be installed. Here you can see that our Windows installation disk image
is loaded into the virtual CD tray, so it will run the next time we boot the VM and Windows can be installed.
||||||||||||||||||||
||||||||||||||||||||
• The network adapter settings are perhaps the most important for us
• Bridged Mode – In this mode, the VM will connect directly to the network your host computer is connected to. It will have its own individual IP address separate from the host OS.
• NAT Mode – In this mode, the VM will connect to the network through the host computer. It will share the same IP address as the host.
• Host Only Mode – In this mode, the VM will only be able to connect to the host OS
The network adapter settings are a critical area to understand. Many people have been tripped up getting their Virtual Machines networked properly. The
three network adapter modes are:
Bridged Mode – In this mode, the virtual machine will connect directly to the network your host computer is connected to. It will have its own individual
IP address separate from your host computer.
NAT Mode – In this mode, the virtual machine will connect to the network through your host computer. It will share the IP address of the host, and have its
own separate internal IP address on a different subnet.
Host Only Mode – In this mode, the virtual machine will only be able to connect to the host OS and any other virtual machines that are running on the host
in host only mode.
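One way to see the difference between bridged and NAT mode is to check which network a guest's address belongs to. The sketch below uses Python's standard `ipaddress` module; every address and subnet in it is made up for illustration, and the subnet your virtualisation software actually uses for NAT will differ.

```python
import ipaddress

# All addresses below are invented for this example.
host_lan = ipaddress.ip_network("192.168.1.0/24")       # the network the host is plugged into
nat_subnet = ipaddress.ip_network("192.168.182.0/24")   # internal subnet behind the host (NAT mode)

bridged_guest = ipaddress.ip_address("192.168.1.57")    # guest in bridged mode
nat_guest = ipaddress.ip_address("192.168.182.128")     # guest in NAT mode

def on_host_lan(address):
    """True if the address sits directly on the host's own network."""
    return address in host_lan

print(on_host_lan(bridged_guest))   # bridged guest appears on the LAN like any other machine
print(on_host_lan(nat_guest))       # NAT guest does not; it reaches the LAN through the host
print(nat_guest in nat_subnet)
```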
||||||||||||||||||||
||||||||||||||||||||
Once the hardware is set, the VM can be created by clicking on finish. A new window will appear with the Virtual Machine within and the installation
process for the OS should start once the machine has booted.
||||||||||||||||||||
||||||||||||||||||||
Once we have the VM running, we can use it like any other computer. It’s a good idea to install VMware Tools on the Virtual Machine’s Operating
System. VMware Tools will make it so that the resolution of the virtual machine automatically resizes based on the size of the window. It also has
convenient functionality like sharing a clipboard between the host and the guest operating systems, and it supports drag and drop of files.
||||||||||||||||||||
||||||||||||||||||||
• You can find ‘Install VMware Tools’ in the Player dropdown under ‘Manage’
• By clicking on it, a disk will automatically be mounted to the VM
• The disk will contain the VMware Tools installer
• This also works under Linux. However, it is usually recommended to use the Linux package manager to install the ‘open-vm-tools-desktop’ package for your Linux distribution. The generic installer is often problematic under Linux.
You can find ‘Install VMware Tools’ in the Player dropdown under ‘Manage’. Clicking on it automatically mounts a disk to the VM that contains the
VMware Tools installer. On Windows it’s a simple case of running the installer and rebooting.
There is also a version of VMware Tools for Linux. You can mount the image the same way as you do in Windows, but it’s often better to use the package
manager included in your Linux distribution to install ‘open-vm-tools-desktop’. This package is usually optimised for that specific Linux distribution and
can solve many of the problems the generic version of VMware Tools has.
||||||||||||||||||||
||||||||||||||||||||
• At the top of the VM screen you might see a pause button. This is the button that suspends the VM.
• Suspension isn’t the same as a shut down. A suspended VM is frozen in time at the moment it was suspended.
• When you come back to it, you can resume the VM and continue what you were doing before it was suspended
• Clicking on the down arrow next to the pause button allows you to select other options such as shut down or restart
• Shut down is similar to holding the power button down on a computer to force it to switch off, so beware. It’s usually better to shut down from within the VM.
SEC201 | Intro to Computer Fundamentals 72
The pause button at the top of the VM screen can be used to suspend the virtual machine. This is different from shutting it down: a suspended virtual
machine is frozen in time at the moment it was suspended. You can resume it and continue where you left off without waiting for it to boot. Clicking the
down arrow next to the pause icon lets you choose other options such as shut down and restart. It’s important to note that these are hardware
actions, similar to holding down the power button on a physical machine, and will not shut down the VM gracefully. Usually you want to restart or
shut down from within the Virtual Machine through software, but you may need these buttons if there is a crash and you can’t reach the shutdown
option within the guest OS.
Networks
Types of Networks
A Local Area Network or LAN is a computer network that connects computers within a limited or local geographic area. Typically this is done over
Ethernet cables or wirelessly (Wi-Fi). LANs can be home networks or office networks. Each computer on a local network will have its own private IP
address, which is unique within that network. These IP addresses are not typically reachable from the Internet.
The entire network will typically have only a single public IP address per Internet connection, which means that many computers share that one address
on the Internet. The process of converting between private IP addresses and public IP addresses is called NAT or Network Address Translation.
It was designed when the Internet was running out of IPv4 addresses so that multiple computers could share a single IP address.
A Wide Area Network or WAN is a computer network that connects computers within a large or wide geographic area. Often these connections are
established via leased lines or VPN (Virtual Private Network) connections. It is possible to join multiple LANs into a WAN. A good example of this type
of network is the Internet.
||||||||||||||||||||
||||||||||||||||||||
Network Hardware
• Router
• Switch
• Hub
Network Hardware
We need to consider the different types of networking hardware which can make up a network. The three key types are Routers, Switches and Hubs. We
will discuss each in turn.
||||||||||||||||||||
||||||||||||||||||||
A router is a network device which typically sits on the border between two networks and acts as a bridge to connect them. Typically a router will be
connected to both your LAN and to the Internet. The router’s job is to forward traffic onward from computers on the LAN to the Internet and from the
Internet back to the LAN. This behaviour is the reason a router is often called a ‘gateway’. On a wider scale, the Internet functions on a massive network of
routers which decide where to send traffic.
||||||||||||||||||||
||||||||||||||||||||
A switch is responsible for forwarding traffic within a LAN. While a router decides whether traffic should be sent to the LAN or on to another network
such as the Internet, the switch’s job is to decide which computer on the LAN traffic should be sent to.
In order to do this, the switch looks at MAC addresses, or hardware addresses. These addresses are unique to each network card. The switch keeps a table
in its memory of which MAC address can be reached through which of its ports, learned from the traffic it sees, and uses that to determine where traffic
should be sent. A separate network protocol called ARP (Address Resolution Protocol) is used by the computers themselves to map IP addresses to MAC
addresses.
You may see cases where a Router also has the capabilities of a Switch built in. These are two-in-one devices, but you should make a clear delineation of
the responsibilities of each in your head.
||||||||||||||||||||
||||||||||||||||||||
A hub serves the same purpose as a switch, in that it is responsible for sending traffic on a local network. Unlike a switch, however, hubs are not
intelligent. They don’t decide which computer should have what traffic; instead, they send all traffic they receive to every computer connected to the hub.
When a computer receives such traffic, it must then decide whether to discard the data (if it wasn’t meant for that computer) or to accept the data.
Hubs are not commonly used anymore, but wireless networks behave in a very similar way. A switch can decide which computer to send data to because
the computers are connected by cables to each port and the switch can choose which cable to send the data through.
With wireless, however, data is transmitted over radio waves, and radio waves can’t be directed only to specific computers. The nature of wireless means
that all traffic can be received by every computer in range. This is the behaviour of a hub.
||||||||||||||||||||
||||||||||||||||||||
Network cards allow a computer to connect to networks. Some network cards are wired only, while some also support wireless. Some might even
be wireless only. Usually the functionality of a network card is built into the motherboard these days, but you can still buy them as separate cards and
install them.
Each Network Card (NIC – Network Interface Card) has a MAC address which is burned into the card. The MAC address is how the Switch knows which
computer to send data to. Although the MAC address is burned into the network card, it can be spoofed at the Operating System level.
||||||||||||||||||||
||||||||||||||||||||
||||||||||||||||||||
||||||||||||||||||||
In the OSI Model, the sending computer works its way down the OSI model from the top. Each layer adds to the data packet. When you reach the bottom
of the stack, at the Physical Layer, data is transmitted to the receiving computer. The receiving computer, having received the data packet at the Physical
Layer, works its way back up the stack. As it moves it removes layers of data from the packet until it reaches the Application Layer, where the original data
is finally received by the application.
||||||||||||||||||||
||||||||||||||||||||
The application layer is where user interaction occurs. It is here that the user inputs data into the application and receives data back from the application.
Examples of Application Layer protocols are:
||||||||||||||||||||
||||||||||||||||||||
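The presentation layer is responsible for translating data between the format the application uses and the format that is sent over the network. This
covers tasks such as character encoding, compression and encryption, which are often handled by the Operating System or its libraries. When a user
interacts with an application at the Application Layer, the application itself interacts with the presentation layer to prepare and display data.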
||||||||||||||||||||
||||||||||||||||||||
The session layer handles the creation and maintenance of sessions between the Presentation Layer and other computers. For example, a user browsing the
Internet with a web browser interacts with the Application Layer. The application layer interacts with the presentation layer, and the presentation layer
interacts with the session layer which interacts with the web server to establish a session.
||||||||||||||||||||
||||||||||||||||||||
• The Transport Layer controls what and how much data is sent over
an established session
The transport layer controls what and how much data is sent over an established session.
||||||||||||||||||||
||||||||||||||||||||
The network layer is responsible for where a ‘packet’ of data is sent and where it originates. For example, routers are often classified as a layer three
device because they operate on the network layer.
||||||||||||||||||||
||||||||||||||||||||
The data link layer is responsible for transmitting data between two directly connected nodes. This is done through the use of MAC addresses (hardware
addresses). The data link layer is also responsible for error checking. For example, switches operate at layer 2 (data link).
||||||||||||||||||||
||||||||||||||||||||
The physical layer is represented by the physical cables (or radio waves) that make up a network and the electrical signals that pass over them.
||||||||||||||||||||
||||||||||||||||||||
IP Addresses
IP Addresses or Internet Protocol Addresses are the addressing scheme of the Internet. They identify both the computer and its location on a particular
network. There are two types of addressing schemes.
The original was IPv4, and it is still widely used today. However, the problem is that it was never designed for the volume of computers that exist today.
Thus, we are rapidly running out of IPv4 addresses. To get around this issue, NAT (Network Address Translation) was developed, with certain IP ranges
designated as private IP address ranges. With NAT an entire LAN can use a single public IP address to access the Internet. This has helped us stave off
running out of IPv4 addresses, and with the development of IPv6 this problem is less pressing.
IPv6 is newer and is slowly gaining traction. It looks intimidating because the addresses are significantly longer than IPv4 addresses, so many people are
reluctant to use it. However, there are enough potential IP addresses in the IPv6 range to assign an IP address to every single atom on the surface of the
earth - and still not run out.
||||||||||||||||||||
||||||||||||||||||||
IP Addresses: IPv4
Each section separated by ‘.’s is represented by 8 bits (1 byte). In total, the IPv4 address range is represented by 32 bits. An example of an IPv4 address:
192.168.0.1.
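To make the "four sections of 8 bits" idea concrete, here is a minimal Python sketch. The function names `ipv4_to_int` and `ipv4_bits` are made up for this illustration; the sketch simply shows each section as a byte and then packs the four bytes into one 32-bit number.

```python
def ipv4_to_int(address: str) -> int:
    """Pack the four dotted sections of an IPv4 address into one 32-bit number."""
    sections = [int(part) for part in address.split(".")]
    if len(sections) != 4 or any(not 0 <= s <= 255 for s in sections):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for section in sections:
        # Shift what we have so far left by 8 bits, then add the next byte.
        value = (value << 8) | section
    return value


def ipv4_bits(address: str) -> str:
    """Show each section as 8 binary digits, separated by dots."""
    return ".".join(f"{int(part):08b}" for part in address.split("."))


print(ipv4_bits("192.168.0.1"))    # 11000000.10101000.00000000.00000001
print(ipv4_to_int("192.168.0.1"))  # 3232235521
```

Because each section is a single byte, no section can ever be larger than 255.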
||||||||||||||||||||
||||||||||||||||||||
IP Addresses: IPv6
• IPv6 Format:
XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX
• Eight blocks of 16 bits separated by colons, 128 bits in total
• E.g.: FE80:0000:0000:0000:903A:1C1A:E802:11E4
• Can be shortened by reducing one row of all-zero blocks to ::
• E.g.: FE80::903A:1C1A:E802:11E4
• Only one row of zeros can be shortened this way
• Other blocks of zeros can each be shortened to a single 0
• E.g.: FE80:0000:0000:0000:903A:0000:0000:11E4
• To: FE80::903A:0:0:11E4
IP Addresses: IPv6
IPv6 can be tricky because of the format. There are shorthand ways to write it because it’s just so long…
If there is a series of all-zero blocks in a row, you can shorten them to :: but only for one such row. If the zeros are broken up into two separate rows,
only one of those rows can become ::, otherwise it would be impossible to tell how many blocks each :: stood for.
Taking a variation on our previous example, FE80:0000:AABB:0000:903A:1C1A:E802:11E4, we could shorten it using the above rule to:
FE80::AABB:0000:903A:1C1A:E802:11E4
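The shortening rules can be tried out with Python's standard `ipaddress` module. This is only a sketch: the helper names `shorten_ipv6` and `expand_ipv6` are made up here, and the module prints addresses in lowercase, so the helpers convert back to uppercase to match the examples in this book.

```python
import ipaddress


def shorten_ipv6(address: str) -> str:
    """Return the shortest standard form of an IPv6 address."""
    return ipaddress.IPv6Address(address).compressed.upper()


def expand_ipv6(address: str) -> str:
    """Return the full eight-block form of an IPv6 address."""
    return ipaddress.IPv6Address(address).exploded.upper()


print(shorten_ipv6("FE80:0000:0000:0000:903A:1C1A:E802:11E4"))
print(shorten_ipv6("FE80:0000:0000:0000:903A:0000:0000:11E4"))
print(expand_ipv6("FE80::903A:0:0:11E4"))
```

Note that the module applies `::` to the longest row of zero blocks and writes every remaining zero block as a single 0.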
||||||||||||||||||||
||||||||||||||||||||
A private IP address is only accessible on the local network. Blocks of IP addresses have been reserved for this purpose. On IPv4 the reserved addresses
are:
• 10.0.0.0 – 10.255.255.255
• 172.16.0.0 – 172.31.255.255
• 192.168.0.0 – 192.168.255.255
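As a quick illustration, the following Python sketch checks whether an IPv4 address falls inside one of the three standard private blocks (10.0.0.0 – 10.255.255.255, 172.16.0.0 – 172.31.255.255 and 192.168.0.0 – 192.168.255.255). The blocks are written here in the /8, /12 and /16 shorthand that is explained under CIDR, and the name `is_private_ipv4` is an assumption made for this example.

```python
import ipaddress

# The three IPv4 blocks reserved for private networks.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_ipv4(address: str) -> bool:
    """Return True if the address sits inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


for candidate in ("10.1.2.3", "172.16.5.4", "192.168.0.1", "8.8.8.8"):
    print(candidate, is_private_ipv4(candidate))
```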
||||||||||||||||||||
||||||||||||||||||||
A public IP address identifies devices that are connected to a WAN. Most commonly these devices are routers that are connected to the Internet. On a local
network, each device will have its own private IP address. When they communicate with the Internet, however, if all those devices share one Internet
connection then they will all share the same public IP address.
The reason there is a distinction between private and public IP addresses is simply that we started to run out of public IP addresses. There aren’t enough for
every single computer connected to the Internet to have its own public IP address. To get around that, certain IP addresses were reserved as private IP
addresses. These aren’t addressable over the Internet, but they are addressable over the LAN. Network Address Translation (NAT) is then used to translate
between private IP addresses and the public IP address. This allows multiple computers to share one public IP address, while they can still communicate
on the local network with different private IP addresses. This has allowed the IPv4 address range to stretch farther than it would have originally. The lack
of address space is still a growing problem, however. IPv6 was designed to resolve it, but its low adoption rate means the problem remains.
||||||||||||||||||||
||||||||||||||||||||
IP Addresses: Netmask
IP Addresses are split up into two sections: the first section is the network identifier, and the second section is the host identifier. The network identifier
tells us which network the traffic is destined for, and the host identifier tells us which computer on the network the traffic is destined for.
Here’s the confusing part: the size of the network identifier and the size of the host identifier can change depending on the size of the network. To work out
which is which, we have to use the netmask. An IPv4 Address has four bytes, and so does a netmask. In the example of a network, such as 10.0.0.0 –
10.255.255.255, the first byte is the network identifier. The rest is the host identifier. So we write the netmask as: 255.0.0.0. We can also use CIDR
notation to write the netmask in shorthand.
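The way a netmask splits an address can be sketched with a bitwise AND. This is an illustration only; `to_int`, `to_dotted` and `split_address` are names invented for the example.

```python
def to_int(dotted: str) -> int:
    """Turn a dotted address or netmask into a 32-bit number."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    return value


def to_dotted(value: int) -> str:
    """Turn a 32-bit number back into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def split_address(address: str, netmask: str) -> tuple[str, str]:
    """Return (network identifier, host identifier) for an address."""
    addr, mask = to_int(address), to_int(netmask)
    network = addr & mask             # keep only the bits the mask covers
    host = addr & ~mask & 0xFFFFFFFF  # keep only the bits the mask leaves out
    return to_dotted(network), to_dotted(host)


print(split_address("10.20.30.40", "255.0.0.0"))
```

With the netmask 255.0.0.0, only the first byte survives in the network identifier and the remaining three bytes form the host identifier.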
||||||||||||||||||||
||||||||||||||||||||
CIDR is a way of representing the netmask in shorthand. Essentially you write a / and then put in the number of bits in the network identifier after the /. So
for a netmask of 255.0.0.0, the first byte is the network identifier. A byte is 8 bits and therefore the network is a /8 network.
||||||||||||||||||||
||||||||||||||||||||
Remember each of the four sections in an IPv4 address is represented by 1 byte or 8 bits. So for example with the IP address: 192.168.0.1, everything
before the first ‘.’ is represented by 8 bits, the next section is 8 bits, the one after that is 8 bits and the final section is also 8 bits. With CIDR, all you’re
doing is saying how many bits form the network identifier.
So for 192.168.0.0/16 all you’re saying is that the first 16 bits are the network identifier. That’s the first two sections of the IPv4 address field (the 192.168).
The rest belongs to the host identifier.
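Going from the number after the / back to a full netmask is just a matter of setting that many bits to 1, starting from the left. A minimal sketch (the function names are assumptions for this example):

```python
def prefix_to_netmask(prefix_length: int) -> str:
    """Convert a CIDR prefix length such as 16 into a dotted netmask."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("an IPv4 prefix length must be between 0 and 32")
    # Start with 32 one-bits, then push zeros in from the right.
    mask = (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def addresses_in_network(prefix_length: int) -> int:
    """Count the addresses available with the remaining host bits."""
    return 2 ** (32 - prefix_length)


print(prefix_to_netmask(8))      # 255.0.0.0
print(prefix_to_netmask(16))     # 255.255.0.0
print(addresses_in_network(16))  # 65536
```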
||||||||||||||||||||
||||||||||||||||||||
IP Addresses: Broadcast
The broadcast address is an IP address on a local network reserved for ‘broadcasting’ data to every node on that local network. Usually the largest address
in the network’s range is reserved for the broadcast address (for example, 192.168.0.255 on the network 192.168.0.0/24) - however this is not always the case.
The broadcast address can be manually changed. When the Switch receives data destined for the broadcast address, it will know to send it to all computers.
Similarly over Wi-Fi, or on a Hub, data will be sent to all connected computers anyway - however those computers that receive data that was destined for
the broadcast address will know not to discard the data packets.
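The usual "largest address in the range" convention can be checked with Python's standard `ipaddress` module. This sketch only covers that default convention, not a manually changed broadcast address, and `default_broadcast` is a name made up for the example.

```python
import ipaddress


def default_broadcast(network: str) -> str:
    """Return the conventional broadcast address for a network in CIDR form."""
    return str(ipaddress.ip_network(network).broadcast_address)


print(default_broadcast("192.168.0.0/24"))  # 192.168.0.255
print(default_broadcast("10.0.0.0/8"))      # 10.255.255.255
```

The result is the network identifier with every host bit set to 1.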
||||||||||||||||||||
||||||||||||||||||||
MAC Addresses
A MAC address or ‘Media Access Control’ address is an address that is assigned to a network interface card. Each MAC address is unique to the card and
is physically burned into the card. The MAC address identifies computers on a local network, even more so than IP addresses do. This is because IP
addresses can change, whereas MAC addresses always stay the same. Of course MAC addresses can be spoofed and changed that way, but in the normal
course of operations they should never change.
||||||||||||||||||||
||||||||||||||||||||
ARP (1)
Address Resolution Protocol is a network protocol designed to map IP addresses to MAC addresses on a local network.
If a computer needs to communicate with another on the local network, it will first look in its ARP table to see if it has cached, or stored, the MAC address
for that IP address. If the result hasn’t been cached, then the computer will send an ARP request to the broadcast address of the local network.
Every computer on that local network will get the ARP request asking if anyone knows what the MAC address is for the IP address in question. The
computer at that IP address will send an ARP response packet to the computer that sent the request, informing the computer of its MAC address.
The computer which sent the request will then cache the result in its ARP table for future reference.
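The lookup-then-broadcast behaviour described above can be modelled in a few lines of Python. This is a toy simulation, not real ARP traffic: the `Host` and `LocalNetwork` classes, the `resolve` function and the addresses used are all invented for the illustration.

```python
class Host:
    def __init__(self, ip: str, mac: str):
        self.ip = ip
        self.mac = mac
        self.arp_table = {}  # cached IP address -> MAC address


class LocalNetwork:
    def __init__(self):
        self.hosts = []

    def attach(self, host: Host) -> None:
        self.hosts.append(host)

    def broadcast_arp_request(self, sender: Host, target_ip: str):
        """Every host sees the request; only the owner of the IP replies."""
        for host in self.hosts:
            if host is not sender and host.ip == target_ip:
                return host.mac
        return None


def resolve(network: LocalNetwork, sender: Host, target_ip: str):
    """Return (MAC address, whether the ARP table already held the answer)."""
    if target_ip in sender.arp_table:
        return sender.arp_table[target_ip], True
    mac = network.broadcast_arp_request(sender, target_ip)
    if mac is not None:
        sender.arp_table[target_ip] = mac  # cache for future reference
    return mac, False


lan = LocalNetwork()
laptop = Host("192.168.0.10", "aa:aa:aa:aa:aa:01")
printer = Host("192.168.0.20", "aa:aa:aa:aa:aa:02")
lan.attach(laptop)
lan.attach(printer)
print(resolve(lan, laptop, "192.168.0.20"))  # first time: broadcast needed
print(resolve(lan, laptop, "192.168.0.20"))  # second time: answered from cache
```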
||||||||||||||||||||
||||||||||||||||||||
ARP (2)
||||||||||||||||||||
||||||||||||||||||||
DNS (1)
DNS or Domain Name System is a way of mapping IP addresses to more memorable domains (such as google.com). The Internet would be almost
unusable if everyone had to remember numbers to navigate their way around. It would be even worse with IPv6. This is where DNS comes in. DNS allows
us to translate domain names into IP addresses, and IP addresses into domain names.
When you enter google.com into your browser, your computer looks at the DNS server assigned to your computer in the network configuration. It sends a
DNS query to that server asking for the IP address of google.com. If the answer is cached, the DNS server will respond with it straight away. If it isn’t
cached, the DNS server will need to look it up. It starts with the Top Level Domain of the domain name. This is the bit that comes after the final ‘.’ such
as ‘.com’. It will query the DNS servers that are authoritative for ‘.com’ (if it does not already know them, it first asks the root DNS servers where to find
them). Those domain name servers will respond with the IP address of the domain name servers that are authoritative for google.com. Then another DNS
query will be sent to the google.com DNS servers which will respond with the IP address for google.com.
DNS is a very robust system designed to account for constant change in the structure of the Internet. Although it seems like there are a lot of queries just to
get to a single IP address, it is designed like this so that the loss of any one DNS server will not affect large swathes of the Internet.
||||||||||||||||||||
||||||||||||||||||||
DNS (2)
• Your DNS server receives the query and breaks down the domain
name. Google.com ends in ‘.com’ so it looks at the DNS servers for
.com (The Top Level Domain or TLD servers).
• The DNS server for the TLD will return back the IP address of the
DNS server for google.com
• The next query will be sent to the domain’s DNS server, which will
respond back with the IP address
• Your browser will now know which IP address to send the request
for the web page to
DNS (2)
When your DNS server receives a query for a particular domain (take google.com for example), it first breaks down the domain into two parts. Everything
after the final ‘.’ is the TLD or Top Level Domain. In our example of google.com the TLD is .com.
Your DNS server will then query the DNS servers that are authoritative for the ‘com’ TLD, asking for ‘google.com’. The ‘com’ DNS servers will respond
with the IP addresses of the authoritative DNS servers for google.com. Then your DNS server will query those DNS servers asking for ‘google.com’ and
they will respond with the IP address of google.com.
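The chain of queries can be sketched as a toy resolver in Python. Nothing here talks to a real DNS server: the three lookup tables and every IP address in them are made up for the illustration (the addresses come from ranges reserved for documentation), and the sketch skips details such as the root servers and record expiry.

```python
# Made-up data standing in for real DNS servers.
TLD_SERVERS = {"com": "192.0.2.1"}                              # TLD -> TLD server
DELEGATIONS = {"192.0.2.1": {"google.com": "192.0.2.53"}}       # TLD server -> domain's server
ADDRESS_RECORDS = {"192.0.2.53": {"google.com": "198.51.100.7"}}  # domain's server -> answer


def resolve(domain: str, cache: dict):
    """Return (IP address, list of steps taken) for a two-part domain name."""
    if domain in cache:
        return cache[domain], ["answered from cache"]
    tld = domain.rsplit(".", 1)[-1]  # everything after the final '.'
    tld_server = TLD_SERVERS[tld]
    authoritative_server = DELEGATIONS[tld_server][domain]
    address = ADDRESS_RECORDS[authoritative_server][domain]
    cache[domain] = address
    steps = [
        f"asked .{tld} server {tld_server}",
        f"asked {domain} server {authoritative_server}",
    ]
    return address, steps


cache = {}
print(resolve("google.com", cache))  # two queries are needed
print(resolve("google.com", cache))  # now answered from the cache
```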
||||||||||||||||||||
||||||||||||||||||||
Thus far, we have looked at how a ‘forward lookup’ works. That is, translating a domain name into an IP address. We can also perform a ‘reverse lookup’,
which is the process of translating an IP address into a domain name. Doing this requires a pointer (PTR) record to have been set up. For IPv4, the name of
this record is made by reversing the order of the four sections of the IP address and adding ‘.in-addr.arpa’ to the end, so 192.0.2.1 becomes
1.2.0.192.in-addr.arpa. This makes a domain name of a sort. The pointer record should point to the domain name that belongs to the IP address.
By doing this, reverse lookups become possible - because if you know an IP address you can simply reverse its sections, add ‘.in-addr.arpa’ to the end,
and look up the DNS record to find the domain name that the IP address belongs to.
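Building the name used for a reverse lookup is simple string work. One detail worth knowing is that the four sections of the IPv4 address are written in reverse order before ‘.in-addr.arpa’ is added. The sketch below builds the name by hand (`reverse_lookup_name` is a made-up helper) and compares it with the `reverse_pointer` attribute from Python's standard `ipaddress` module.

```python
import ipaddress


def reverse_lookup_name(address: str) -> str:
    """Build the in-addr.arpa name for an IPv4 address."""
    sections = address.split(".")
    return ".".join(reversed(sections)) + ".in-addr.arpa"


print(reverse_lookup_name("192.0.2.1"))                 # 1.2.0.192.in-addr.arpa
print(ipaddress.ip_address("192.0.2.1").reverse_pointer)  # same result from the library
```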
||||||||||||||||||||
||||||||||||||||||||
DHCP
DHCP or Dynamic Host Configuration Protocol is a network protocol which allows computers to automatically configure their network settings upon
joining a network. The way it works is the DHCP server has a pool of IP addresses it can lease to clients joining the network. Once it leases an IP address
to a client it removes it from the pool to avoid IP address conflicts. Each IP address assigned is leased and set to expire after a set duration. Once the
lease expires, the address returns to the pool and the client may be given a different IP address next time. The DHCP server will automatically configure
a client’s IP address, netmask, gateway address and DNS servers.
Be careful of ‘rogue’ DHCP servers, which is where there is more than one DHCP server running on a network by mistake. In this case, there will be two
pools of IP addresses and they won’t be in sync - so you may start to get IP address collisions.
The alternative to DHCP is static configuration, where you configure the network settings of a client yourself. This may be more convenient in some
circumstances, particularly where you are dealing with servers which always need to have the same IP address. Just make sure you don’t assign an IP
address that is part of your DHCP pool, otherwise you may start to see IP address collisions.
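The pool-and-lease idea can be sketched as a small Python class. This is a simplified model and not the real DHCP message exchange: the `DhcpPool` class is invented for the illustration, clients are identified by an arbitrary string, and the current time is passed in as a plain number of seconds.

```python
class DhcpPool:
    def __init__(self, addresses, lease_seconds: int):
        self.free = list(addresses)  # addresses not currently leased
        self.lease_seconds = lease_seconds
        self.leases = {}             # client -> (ip, time the lease expires)

    def expire(self, now: int) -> None:
        """Return the addresses of expired leases to the pool."""
        for client, (ip, expires_at) in list(self.leases.items()):
            if expires_at <= now:
                del self.leases[client]
                self.free.append(ip)

    def request(self, client: str, now: int):
        """Lease an address to a client, or renew the one it already holds."""
        self.expire(now)
        if client in self.leases:
            ip, _ = self.leases[client]
        elif self.free:
            ip = self.free.pop(0)    # removed from the pool to avoid conflicts
        else:
            return None              # the pool is exhausted
        self.leases[client] = (ip, now + self.lease_seconds)
        return ip


pool = DhcpPool(["192.168.0.100", "192.168.0.101"], lease_seconds=60)
print(pool.request("laptop", now=0))
print(pool.request("phone", now=0))
print(pool.request("tablet", now=0))   # nothing left to lease
print(pool.request("tablet", now=61))  # earlier leases have expired by now
```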
||||||||||||||||||||
||||||||||||||||||||
Packets (1)
We’ve talked a lot about sending ‘data’ so far. Data is transmitted as ‘packets’ over a network. Packets tend to be small, so most data is broken up into
chunks in preparation for transmission over the network. Each data packet contains a ‘header’ and a ‘payload’. The header describes the packet, where it
has come from, where it’s going, how far it can go before it expires, the sequence number if the data was split up - that sort of thing. The payload is the
data that is sent in the packet.
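Here is a rough Python sketch of data being broken into packets, each with a header and a payload. The `Packet` fields, the `packetize` and `reassemble` helpers and the tiny 8-byte payload size are all assumptions chosen to keep the example readable; real packets carry far more header detail.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # Header: describes the packet.
    source: str
    destination: str
    ttl: int        # how far the packet can go before it expires
    sequence: int   # position of this chunk in the original data
    # Payload: the data itself.
    payload: bytes


def packetize(data: bytes, source: str, destination: str,
              max_payload: int = 8, ttl: int = 64) -> list:
    """Split data into chunks and wrap each chunk in a packet."""
    return [
        Packet(source, destination, ttl, sequence, data[start:start + max_payload])
        for sequence, start in enumerate(range(0, len(data), max_payload))
    ]


def reassemble(packets: list) -> bytes:
    """Put the payloads back together using the sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet.sequence)
    return b"".join(packet.payload for packet in ordered)


packets = packetize(b"hello, network world", "192.168.0.10", "192.168.0.20")
print(len(packets))
print(reassemble(list(reversed(packets))))
```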
||||||||||||||||||||
||||||||||||||||||||
Protocols
A protocol is a defined set of rules for communication. As with people, computers need to speak the same language before they can
communicate. We can use TCP as an example.
TCP stands for ‘Transmission Control Protocol’, which is a transport layer protocol. In TCP, the computer initiating the connection contacts the receiver
and sends a SYN (synchronise) packet which indicates it wants to establish a connection. The receiving computer receives the SYN packet and responds
with a SYN ACK (synchronise-acknowledge) packet.
The sender then responds with an ACK (acknowledge) packet and the session is then established. Data is sent, and when the connection is to be terminated
one side will send a FIN (finish) packet. The other side will then send a FIN ACK in response, and finally the sender will send an ACK and the connection
will be closed. This is the ‘language’ of TCP.
||||||||||||||||||||
||||||||||||||||||||
Ports (1)
A port is essentially a communications channel. Each channel is numbered, and as a rule no two applications can listen on the same port at the same time.
Ports are used to separate out network traffic destined for separate applications that are running on a computer. For example, you may have a web server
and a mail server running on the same computer. The web server will accept traffic coming in on ports 80 and 443 – meanwhile, the mail server may accept
incoming connections on port 25. It would be confusing if a web server was receiving traffic meant for the e-mail server and vice versa, which is why ports
are so important. They separate out traffic, so that only the application that needs that data will receive it.
Often the notation: 127.0.0.1:80 is used to write port numbers. The first part is the IP address, the colon separates the IP address from the port number, and
finally there is the port number.
||||||||||||||||||||
||||||||||||||||||||
Ports (2)
Of course, those are just the ports that are used by default. In many applications you can customise the port that the application listens to. For example, it is
common practice to run a webserver on port 8080, if you already have one on port 80 on the same computer.
||||||||||||||||||||
||||||||||||||||||||
Packets (2)
In packets, the two most commonly used Transport Layer protocols are TCP (Transmission Control Protocol), and UDP (User Datagram Protocol). There
are other protocols, but our focus will be primarily on these two.
||||||||||||||||||||
||||||||||||||||||||
The TCP/IP model is similar to the OSI model. It’s actually not just for TCP or IP: it also applies to other communications protocols. It’s only called
TCP/IP because those were the first two protocols defined using it when the model was developed. It contains the same basic elements as the OSI model,
but some of the layers have been combined.
||||||||||||||||||||
||||||||||||||||||||
Here we can see the TCP/IP model has merged the responsibilities of the Application, Presentation and Session layers into the Application layer. Similarly
the Data Link Layer and Physical Layer have been merged into the Network Access Layer. The functions they perform at every stage remain the same; it’s
just that they have been grouped up.
||||||||||||||||||||
||||||||||||||||||||
• Application Layer Protocols deal with the format the payload data
of a packet will be in so it can be understood by the recipient
application
• Examples of Application Layer Protocols:
• HTTP
• DNS
• SMTP
• SMB
At the application layer we are dealing with the contents of the data packet and not any of the headers. At the application layer, the data sent in the packet
payload needs to be in the correct format to be understood by the application on the receiving end. This format is usually called the Application Layer
Protocol. Examples of that would be HTTP, DNS, SMTP, SMB or FTP.
||||||||||||||||||||
||||||||||||||||||||
The transport layer protocol is the protocol that deals with how two computers communicate. Depending on the protocol, this can involve reliability
checking, error checking, flow control, same order delivery and various other features. Examples of transport layer protocols include TCP and UDP.
||||||||||||||||||||
||||||||||||||||||||
TCP Protocol
The TCP protocol is a protocol that allows for reliable delivery of data across a network. Some of the main features of the TCP protocol include:
Error checking – TCP packets are checked for errors that were introduced into the packet during transmission. If the error-checking fails it indicates the
data packet was corrupted in transit and needs to be re-transmitted.
Ordering – TCP packets are ordered so that they can be re-assembled in the correct order by the recipient.
Re-delivery – TCP packets are numbered according to a sequence. If one packet is missing because it was dropped in transit, then the recipient knows and
will ask for that particular packet to be re-transmitted.
All in all, the TCP protocol is a reliable communications method in that it makes sure that all the data gets to the destination. However, all these features
mean the protocol itself has a high overhead and is considered relatively slow – and so not appropriate for all applications.
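The ordering and re-delivery features can be sketched in Python. This is a simplification: it numbers whole packets from 0, whereas real TCP numbers individual bytes and uses acknowledgements to work out what is missing. The function names are invented for the example.

```python
def missing_sequence_numbers(received: dict, expected_count: int) -> list:
    """List the packets the recipient would need re-transmitted."""
    return [seq for seq in range(expected_count) if seq not in received]


def deliver_in_order(received: dict) -> bytes:
    """Hand data to the application only up to the first gap."""
    data = b""
    seq = 0
    while seq in received:
        data += received[seq]
        seq += 1
    return data


# Packet 1 was dropped in transit and packet 2 arrived before it.
received = {0: b"AB", 2: b"EF"}
print(missing_sequence_numbers(received, expected_count=3))
print(deliver_in_order(received))  # packet 2 has to wait for packet 1

received[1] = b"CD"                # the re-transmitted packet arrives
print(deliver_in_order(received))
```

The wait for the missing packet in the middle of this example is the source of the extra delay that makes TCP a poor fit for some applications.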
Take video chat for example: With video chat, frames in the video are transmitted over the Internet. Each frame is a still image and each image is played on
the monitor at a fast rate, typically more than 30 frames per second. This gives the impression of a moving picture. The TCP protocol is not a good fit for
this kind of application. First of all, TCP is quite a slow protocol so those with a slower connection will notice much more latency than they would if the
video chat application communicated over UDP, but the main reason is that re-transmission is unnecessary. In TCP, if a few packets were to drop, then
everything would pause until those packets could be re-transmitted. In a video chat application, dropped packets mean dropped frames. Would you notice
if 2 frames out of 30 were dropped from the video? Unlikely. Would you want the video to pause until those two frames could be re-transmitted when you
wouldn’t even notice if they weren’t there? It wouldn’t be a very good design choice.
||||||||||||||||||||
||||||||||||||||||||
TCP Header
Here we have the TCP header. The header is added to the payload from the application layer. This is known as encapsulation.
Notice how there is no destination IP address here: that is handled at the network layer, not the transport layer.
• Source Port – Which port sent the data. For a client this is usually chosen by the Operating System from a range of available ports. These are usually
known as ‘ephemeral’ ports.
• Destination Port – Which port the data is going to.
• Sequence Number – Marks where the data in this packet sits in the overall stream. When the first SYN is sent, the sender generates a random initial
sequence number; the other side generates its own random initial sequence number for its SYN-ACK.
• Acknowledgement Number – The next sequence number the sender of this packet expects to receive. During the handshake this is the other side’s
initial sequence number incremented by 1, which indicates that the SYN was received.
• Header Length – The size of the TCP header.
• TCP Flags – These flags are used to determine what type of TCP packet is being sent. For example, SYN, ACK, FIN, RST, etc. A SYN-ACK packet has
both the SYN and ACK flags set.
• Window Size – Window Size is used in flow control to make sure that the amount of data sent doesn’t overwhelm the recipient.
• TCP Checksum – The checksum is used to validate if the data was corrupted in transit. If the checksum fails then the packet is discarded and must be
retransmitted.
• Urgent Pointer – A pointer that can be used to mark a section of data in the packet’s payload as urgent.
• TCP Options – Other TCP options can be set here, but we won’t be going into them.
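To see how these fields sit in the first 20 bytes of a TCP header, here is a Python sketch using the standard `struct` module. The sample header is built by hand with made-up values (including a placeholder checksum of 0), and `parse_tcp_header` is a name invented for the illustration; TCP options are ignored.

```python
import struct

# One bit per flag in the flags byte.
FLAG_NAMES = {0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH", 0x10: "ACK", 0x20: "URG"}


def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed 20-byte part of a TCP header."""
    (source_port, destination_port, sequence, acknowledgement,
     offset_byte, flag_byte, window, checksum, urgent) = struct.unpack(
        "!HHIIBBHHH", segment[:20])
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "sequence": sequence,
        "acknowledgement": acknowledgement,
        # The top 4 bits hold the header length in 32-bit words.
        "header_length": (offset_byte >> 4) * 4,
        "flags": [name for bit, name in FLAG_NAMES.items() if flag_byte & bit],
        "window_size": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# A hand-made SYN from ephemeral port 49152 to port 80 with sequence number 42.
sample = struct.pack("!HHIIBBHHH", 49152, 80, 42, 0, 5 << 4, 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```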
||||||||||||||||||||
||||||||||||||||||||
TCP Handshake
This is an example of a TCP handshake for setting up a connection between two computers.
The client sends a SYN packet to the server. The sequence number is randomly generated. Let’s use 42 for our example.
The server then responds with a SYN-ACK packet. The server randomly generates its own sequence number; let’s say 100. The acknowledgement number
will be the client’s sequence number plus one: 42 + 1 = 43.
The client then responds with an ACK packet, incrementing its sequence number. So the sequence number will be 43 and the acknowledgement number
will be the server’s sequence number plus one: 100 + 1 = 101.
After that, the connection is considered to be established and data can be sent.
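The numbers in a handshake can be worked through with a small Python sketch. It assumes, as TCP does, that each side picks its own initial sequence number and that an acknowledgement number is the other side's sequence number plus one. The function name and the dictionary layout are made up for the example.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three packets of a TCP handshake as simple dictionaries."""
    wrap = 2 ** 32  # sequence numbers are 32 bits wide and wrap around
    syn = {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None}
    syn_ack = {"from": "server", "flags": "SYN-ACK", "seq": server_isn,
               "ack": (client_isn + 1) % wrap}
    ack = {"from": "client", "flags": "ACK", "seq": (client_isn + 1) % wrap,
           "ack": (server_isn + 1) % wrap}
    return [syn, syn_ack, ack]


for packet in three_way_handshake(client_isn=42, server_isn=100):
    print(packet)
```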
||||||||||||||||||||
||||||||||||||||||||
TCP Teardown
The teardown is the counterpart of the handshake. It’s used for closing a connection. Here the client sends a FIN packet to the server to initiate the teardown
procedure. The server responds with a FIN-ACK, which acknowledges the client’s FIN and signals that the server has finished sending too. The client
responds with an ACK, and the connection is considered to be terminated.
||||||||||||||||||||
||||||||||||||||||||
TCP Reset
The TCP Reset packet is used to forcefully tear down a connection from one side. This usually happens if there’s a problem with the connection. Either side
can send an RST packet to end the connection immediately.
||||||||||||||||||||
||||||||||||||||||||
UDP Protocol
The UDP protocol is a connection-less protocol which prioritises speed over reliability. This means that there is no handshake or teardown procedure. There is
a checksum, but if it fails then there is no request for re-transmission. Essentially, there is no guarantee that UDP packets will ever get to their destination.
UDP is the preferred protocol for applications where speed is the priority over reliability. Such applications include things like video chat, where a dropped
packet means a skipped frame on the video but there is no time to re-transmit it.
||||||||||||||||||||
||||||||||||||||||||
UDP Header
Here we have a UDP header. Again, note that there is no destination IP address. That is handled at the network layer, not the transport layer.
There is a UDP checksum, which is used to determine if a packet was corrupted in transit or not – though such packets are only discarded and not re-
transmitted.
Note also that there are no sequence numbers or acknowledgement numbers. That is because there is no sequence: UDP does not re-order packets or check
for dropped packets, so sequence numbers are not necessary.
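The UDP header is only 8 bytes long, which is easy to see in a short Python sketch using the standard `struct` module. The helper names and the sample port numbers are assumptions for this illustration, and the checksum is left as a placeholder value rather than being calculated.

```python
import struct


def build_udp_header(source_port: int, destination_port: int,
                     payload: bytes, checksum: int = 0) -> bytes:
    """Pack the four 16-bit UDP header fields (checksum is a placeholder)."""
    length = 8 + len(payload)  # the length covers the header and the payload
    return struct.pack("!HHHH", source_port, destination_port, length, checksum)


def parse_udp_header(datagram: bytes) -> dict:
    """Unpack the 8-byte UDP header from the front of a datagram."""
    source_port, destination_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:8])
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "length": length,
        "checksum": checksum,
    }


payload = b"frame-0001"
datagram = build_udp_header(50000, 5004, payload) + payload
print(parse_udp_header(datagram))
```

Compared with the TCP header there is nothing here to track order or acknowledge delivery, which is what keeps UDP lightweight.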
||||||||||||||||||||
||||||||||||||||||||
Ports (3)
TCP and UDP ports are separate from each other. You can have an application listening on TCP port 80, and a different application listening on UDP port
80 without a clash.
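You can try this on your own machine with Python's standard `socket` module. The sketch below is an experiment rather than a guarantee: it uses the loopback address 127.0.0.1, asks the Operating System for any free TCP port, and the two function names are made up for the illustration. It assumes no socket options such as address reuse have been set.

```python
import socket


def tcp_and_udp_can_share_a_port() -> bool:
    """Listen on a TCP port, then try to bind the same number as a UDP port."""
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        tcp.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        tcp.listen()
        port = tcp.getsockname()[1]
        udp.bind(("127.0.0.1", port))
        return True
    except OSError:
        return False
    finally:
        tcp.close()
        udp.close()


def second_tcp_listener_is_refused() -> bool:
    """Try to bind a second TCP socket to a port that is already in use."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError:
            return True   # the OS refused: the TCP port is taken
        return False
    finally:
        first.close()
        second.close()


print(tcp_and_udp_can_share_a_port())
print(second_tcp_listener_is_refused())
```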
||||||||||||||||||||
||||||||||||||||||||
Internet Layer
At the Internet layer, protocols deal with transmission of data packets across network boundaries. That usually means from the local network out onto the
Internet. The Internet layer is responsible for determining the next ‘hop’ on the route to the packet’s destination.
||||||||||||||||||||
||||||||||||||||||||
Internet Protocol
The Internet Protocol header is the part of the packet that deals with IP addresses. You likely noticed that in the Transport Layer headers there were no
fields for a source or destination IP address. That is because the layer that handles that is the Internet Layer. IPv4 has its own header, and IPv6 similarly
has its own header.
||||||||||||||||||||
||||||||||||||||||||
IPv4 Header
The IPv4 header is responsible for getting the packet to the final destination. Let’s go through some of the fields here:
• Version – Is the header for an IPv4 packet or IPv6. It’s IPv4 if the value in the Version field is ‘4’.
• IHL – Internet Header Length. An IPv4 header is of variable length depending on what options are set, so you have to tell it how long the header is
here. The minimum is 20 bytes, the maximum is 60 bytes.
• Total Length – This field defines the entire packet size including the network layer header and the payload data.
• Identification – When a packet is too large for a link on its route it is split up or ‘fragmented’ into separate packets. Each of those fragmented packets will have the same identification value so the receiving computer knows they were originally one packet.
• Flags – These flags are used to control fragmentation. Usually no flags are set, but if the DF (Don’t Fragment) flag is set then the packet can’t be
fragmented, and if it can’t be sent without being fragmented then the packet is dropped instead. The MF (More Fragments) flag on the other hand
indicates that the packet has been fragmented and to expect more fragments to come in. All fragmented packets have the MF flag set except for the
last packet.
• Fragment Offset – This specifies where in the original packet the data in a fragmented packet was located. This helps when fragmented packets
must be re-assembled.
• Time To Live (TTL) – The TTL is the number of hops left in the packet before it expires. The TTL is decremented at every hop. This prevents
transmission loops where data packets keep going endlessly.
• Protocol – The protocol carried in the packet’s payload. For example, TCP, UDP, ICMP, OSPF, etc.
• Header Checksum – This is used for error checking of the packet’s header.
• Source Address – The IP address of the sender.
• Destination Address – The destination IP address.
• Options – Not frequently used, but if options are present the IHL field must be increased to cover the longer header.
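The fields above can be read out of raw bytes as follows. This is a sketch only: the sample header is hand-built for the illustration (the addresses and values are made up, and the checksum is left at 0), and the function name `parse_ipv4_header` is hypothetical.

```python
import socket
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (version_ihl, _tos, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": version_ihl >> 4,                 # top 4 bits
        "header_bytes": (version_ihl & 0x0F) * 4,    # IHL counts 32-bit words
        "total_length": total_length,
        "identification": identification,
        "dont_fragment": bool(flags_frag & 0x4000),  # DF flag
        "more_fragments": bool(flags_frag & 0x2000), # MF flag
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,                        # 6 = TCP, 17 = UDP, 1 = ICMP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# Hand-built example header: IPv4, IHL 5, DF set, TTL 64, carrying TCP.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0x1C46, 0x4000, 64, 6, 0,
    socket.inet_aton("192.168.0.10"), socket.inet_aton("10.0.0.1"),
)
for name, value in parse_ipv4_header(sample).items():
    print(f"{name}: {value}")
```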
||||||||||||||||||||
||||||||||||||||||||
IPv6 Header
The IPv6 header is similar to the IPv4 header, but it has a fixed length of 40 bytes. It has also been greatly simplified.
• Version – The version of the IP protocol being used. For IPv6 it’s 6 (surprise!).
• Payload Length – The length of the payload including any other headers such as TCP headers.
• Next Header – Specifies the type of the next header. This is usually a protocol such as TCP.
• Hop Limit – Like the TTL from IPv4, it indicates how many hops are left before the packet expires.
• Source Address – The IP address the packet came from.
• Destination Address – The IP address the packet is going to.
||||||||||||||||||||
||||||||||||||||||||
ICMP
ICMP is classified as an Internet layer protocol, but it is always carried inside Internet Protocol packets. It is a protocol used to send messages about errors in Internet Protocol operations.
We most often see ICMP being used by ping. A ping packet is an ICMP Echo Request, which is used to judge whether a host is online or offline.
||||||||||||||||||||
||||||||||||||||||||
ICMP Header
Here we have an example of an ICMP header. Notice there is no IP information here. Although it’s also an Internet layer protocol, it still must be used with
Internet Protocol. Let’s go through some of the fields.
• Type – The type of ICMP control message to send (see below for the list).
• Code – The subtype of ICMP control message to send (see below for the list).
• Checksum – Error checking.
• Rest of Header – Varies depending on the type and code of the ICMP packet.
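As an illustration of these fields, here is a sketch that builds the bytes of an ICMP Echo Request (type 8, code 0), where the ‘rest of header’ holds an identifier and a sequence number. The function names are made up for this example, and nothing is sent on the network; sending would need a raw socket and usually administrator rights.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum used by ICMP (and the IPv4 header)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Type 8, code 0; the checksum is calculated with its own field set to 0."""
    unchecked = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
```

A receiver can validate the packet by running the same checksum over all of its bytes: for an undamaged packet the result is 0.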
||||||||||||||||||||
||||||||||||||||||||
The Network Access Layer or Link Layer is used to define how a network will transmit the packet. This layer deals with internal network communication,
as opposed to the Internet Layer which deals with data transmission across networks. The Network Access Layer interfaces with physical hardware. It is at
this layer that MAC addresses are added to the packet. An example of a Link Layer protocol is Ethernet.
||||||||||||||||||||
||||||||||||||||||||
Ethernet Frame
Ethernet Frame
The Ethernet Frame is added at the Network Access Layer. It handles packet transmission within the network. Here we can see the destination and source MAC addresses have been added to the packet, and also a footer. This is the reason it is called an Ethernet Frame, and not an Ethernet Header: it consists of both a header and a footer.
||||||||||||||||||||
||||||||||||||||||||
• Data moves from the top level of the TCP/IP Model down the stack
• As data moves down the stack, new headers are added. This process is known as ‘encapsulation’
Data is generated at the Application Layer. It is sent down the stack to the Transport Layer, where a header is added. The data with the Transport Layer header added is then sent down the stack to the Internet Layer where another header is added. That data, with both headers, is sent down the stack further to the Link Layer where the data is encapsulated further with a Frame. Finally, from the Link Layer the data is sent to the Physical Layer, which would be the Ethernet cable or wireless signal that is used to transmit the data to the network.
When the data is received at the Physical Layer of the receiving computer, it works its way up the TCP/IP stack. At each level, the topmost header is
removed. First the Link frame is removed, then the Internet header, then the Transport header and finally the Application will receive the raw data.
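The journey down and back up the stack can be sketched with placeholder headers. The byte strings below are stand-ins invented for this illustration; real headers are binary structures like the ones described earlier.

```python
# Placeholder headers, purely illustrative.
TRANSPORT_HEADER = b"[TCP]"
INTERNET_HEADER = b"[IP]"
LINK_HEADER = b"[ETH]"
LINK_FOOTER = b"[FCS]"

def encapsulate(data: bytes) -> bytes:
    """Sender side: each layer wraps what it receives from the layer above."""
    segment = TRANSPORT_HEADER + data            # Transport Layer
    packet = INTERNET_HEADER + segment           # Internet Layer
    frame = LINK_HEADER + packet + LINK_FOOTER   # Link Layer adds header and footer
    return frame

def decapsulate(frame: bytes) -> bytes:
    """Receiver side: each layer strips its own wrapper and passes the rest up."""
    packet = frame[len(LINK_HEADER):-len(LINK_FOOTER)]
    segment = packet[len(INTERNET_HEADER):]
    data = segment[len(TRANSPORT_HEADER):]
    return data

frame = encapsulate(b"GET / HTTP/1.1")
print(frame)               # b'[ETH][IP][TCP]GET / HTTP/1.1[FCS]'
print(decapsulate(frame))  # b'GET / HTTP/1.1'
```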
||||||||||||||||||||
||||||||||||||||||||
Encoding
||||||||||||||||||||
||||||||||||||||||||
Encoding (1)
The process of encoding is simply converting data from one format to another. It is not to be confused with encryption. Encoding is primarily used for
efficient storage of data, or for easier transmission of data. We’ve already done some encoding when we were converting binary to decimal and vice versa.
It’s important to remember that a computer only understands binary. So the text you are seeing on the screen is only binary, but it has been encoded into a
format that we can understand. In the case of text, we’re talking about using some form of character encoding such as ASCII (American Standard Code for
Information Interchange) where each binary value maps to a letter in a table.
||||||||||||||||||||
||||||||||||||||||||
Encoding (2)
One form of encoding you’ll see more and more of is Base64. Base64 is a numbering system, like Binary which is Base 2 and Hexadecimal which is Base 16. Just as Hexadecimal ran out of digits and had to borrow letters, Base64 uses a combination of upper- and lower-case letters, digits and two symbols (usually ‘+’ and ‘/’) to reach 64 values. You can often recognise Base64 encoded data by the ‘=’ padding at the end: it isn’t always present, but in many cases there will be one or two. Take ‘hello’ encoded in Base64 as an example: aGVsbG8=
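The example can be reproduced with Python’s standard `base64` module. The second input, ‘hello!’, is added here only to show a case where no ‘=’ padding is needed.

```python
import base64

encoded = base64.b64encode(b"hello")
print(encoded)                       # b'aGVsbG8='
print(base64.b64decode(encoded))     # b'hello'

# Six input bytes divide evenly into groups of three, so no padding is added.
print(base64.b64encode(b"hello!"))   # b'aGVsbG8h'
```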
||||||||||||||||||||
||||||||||||||||||||
Encoding – ASCII
ASCII or the American Standard Code for Information Interchange is a form of text encoding where binary values can be mapped to letters and symbols.
Take a capital ‘A’ for example: according to the ASCII table, it has the decimal value of 65 or the hexadecimal value of 41. Of course, in memory
everything is just numbers - so the computer has to know how to interpret the data.
ASCII isn’t the only character encoding in use, but it is the foundation for most others. Unicode encodings such as UTF-8, which is backwards compatible with ASCII, also need to be considered.
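A short illustration of the mapping, using Python’s built-in `ord` and `chr` functions. The string ‘Hi’ is just an arbitrary example.

```python
letter = "A"
code = ord(letter)            # character -> number

print(code)                   # 65
print(hex(code))              # 0x41
print(format(code, "08b"))    # 01000001
print(chr(0x41))              # A  (number -> character)

# A string stored as ASCII is just a sequence of these numbers.
print(list("Hi".encode("ascii")))   # [72, 105]
```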
||||||||||||||||||||
||||||||||||||||||||
Here is a chart of all the ASCII code values and what they map to. For those on a Windows PC with a number pad, you can try using these values with ALT codes. Hold down the ‘ALT’ key, type 65 on the number pad, then release ALT. You will see the capital letter A appear on screen. 65 is of course the decimal value of a capital ‘A’ in ASCII. Try it again with some other values.
||||||||||||||||||||
||||||||||||||||||||
Encoding – URL
• If you ever typed a URL into your browser with spaces in it, you may have noticed this
• The URL is changed automatically by your browser and each instance of a space was replaced with %20
• %20 is the URL encoded form of a space
• Flip back to the ASCII table and look up the [space] entry. In Hexadecimal, it’s 20
• So the %20 just means the ASCII character that is represented by 0x20 or 20 hexadecimal
URL Encoding is a type of encoding commonly seen on the Internet. Browsers will often do this automatically for you so you don’t have to worry about it,
but it’s important to keep it in mind. URL Encoding involves using the % sign to signify that what follows next is a URL encoded value. Let’s take a
‘space’ for example: a ‘space’ can be encoded as %20.
Take another look at the ASCII chart, and look up ‘space’. You’ll notice the hexadecimal value is 20. So that’s all URL encoding is - you can use the hexadecimal ASCII value of a character so long as you precede it with a % sign. You could even URL encode entire links. Look at this link:
https://www.%67%6F%6F%67%6C%65.com
Where does it go? It goes to google.com.
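You can check this yourself with the `quote` and `unquote` functions from Python’s standard `urllib.parse` module. The file name below is an arbitrary example.

```python
from urllib.parse import quote, unquote

print(hex(ord(" ")))                    # 0x20
print(quote("my file.txt"))             # my%20file.txt
print(unquote("%67%6F%6F%67%6C%65"))    # google
```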
||||||||||||||||||||
||||||||||||||||||||
Encoding
Encoding and encryption often go hand in hand. A form of encoding is usually used to normalise values that come out of the encryption process. Think about it: if you encrypt some data, you may get numerical values that don’t have a printable ASCII representation. So how do you see them on the screen? The answer is you can’t.
What you do see will appear garbled. Some values that appear the same may actually be different, depending on how values are represented on the screen when they have no specific ASCII value in the chart. To get around that, we often encode encrypted data in a format such as Base64, which ensures each value is represented by printable ASCII characters.
||||||||||||||||||||
||||||||||||||||||||
Encryption
||||||||||||||||||||
||||||||||||||||||||
Encryption (1)
Encryption and encoding are often confused, and with good reason: they are very similar. In fact, encryption is a type of encoding that uses not just the
values you want to change but also a secret key to transform the data into another form. The data cannot be transformed back without knowing both the
process for encryption/decryption and the secret key.
It’s important that you don’t go down the path of using encryption where the encryption process itself is secret, however. Such methods are often highly
vulnerable. Instead, a good encryption algorithm will be one that has been in the public domain for years, even decades. These algorithms will be the ones
that have survived the gauntlet of mathematicians trying to find flaws in them. The only secret should be the key you use, in conjunction with the
encryption algorithm, and not the method for encryption itself.
||||||||||||||||||||
||||||||||||||||||||
Encryption (2)
Every means of encryption will eventually be broken. As hardware improves, it will start to become more and more feasible to break encryption by throwing processing power at it. This is a brute force method that, given the rate of hardware advances, will eventually become viable for all forms of encryption. It’s simply a case of making sure you are using an encryption algorithm that can stand up to modern day hardware.
An example of an old encryption algorithm that has been broken due to hardware advances is DES. DES came about in the 1970s, and at the time it was widely used, even by the military. It relied on a 56-bit key (officially it was a 64-bit key, but 8 bits of that were reserved for parity checking, so in practice it was 56 bits). By 1999, hardware had advanced far enough that a DES key could be found by brute force in under a day.
Aside from simply throwing hardware at a problem, encryption algorithms themselves come under scrutiny by mathematicians who take cracking
encryption as a personal challenge. Sometimes flaws that no one realised existed do get found.
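To get a feel for why key size matters, here is some back-of-the-envelope arithmetic. The guessing rate is a made-up round number chosen for illustration, not a measurement of any real hardware.

```python
KEYS_PER_SECOND = 10**12    # hypothetical attacker speed, for illustration only

def brute_force_hours(key_bits: int, keys_per_second: int) -> float:
    """Worst-case time to try every possible key of the given size."""
    return 2 ** key_bits / keys_per_second / 3600

print(2 ** 56)    # 72057594037927936 possible 56-bit keys
print(f"56-bit key:  {brute_force_hours(56, KEYS_PER_SECOND):.1f} hours")

years_128 = brute_force_hours(128, KEYS_PER_SECOND) / 24 / 365
print(f"128-bit key: {years_128:.2e} years")
```

Every extra bit doubles the number of keys, which is why moving from 56 to 128 bits changes the answer from hours to an astronomically long time at the same assumed speed.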
||||||||||||||||||||
||||||||||||||||||||
Symmetric Encryption is a form of encryption where the same key is used for both encrypting and decrypting data. Symmetric encryption is fast and secure (depending on the algorithm used for encryption of course), but the problem is key exchange. Both parties have to know the shared key. If you send the key and then the encrypted message, anyone who intercepted the key could read the message. Therefore the key has to be sent out of band using something like a text message or an in-person conversation. Even then, it is not 100% secure: the key could be overheard, or text messages could be intercepted.
The problem is exacerbated when you want to talk about automated systems such as HTTPS encryption where the encryption is automatic. There’s no
leeway for out of band communication in those cases.
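The defining property, one key for both directions, can be shown with a toy cipher. The repeating-key XOR below is deliberately simple and is NOT secure; it stands in for a real algorithm such as AES only to show the same key being used to encrypt and to decrypt. The key and message are invented for the example.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR every byte with a repeating key. Not secure."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"shared-secret"          # both parties must already know this
ciphertext = xor_cipher(b"attack at dawn", shared_key)
plaintext = xor_cipher(ciphertext, shared_key)   # the same key reverses it

print(ciphertext.hex())
print(plaintext)                       # b'attack at dawn'
```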
||||||||||||||||||||
||||||||||||||||||||
The two aspects that determine the strength of symmetric encryption are firstly the encryption algorithm and secondly the size of the key used. Each algorithm has a limit to how large the key can be. AES, for example, supports key sizes of 128, 192 and 256 bits, so 256 bits is its maximum. It doesn’t matter how secure the encryption algorithm is if you use a weak key which can be easily guessed, such as a dictionary word or a small key.
||||||||||||||||||||
||||||||||||||||||||
Asymmetric encryption is a form of encryption where the key used to encrypt data is different from the key used to decrypt it. Often we see this in the form
of a public and private key.
The public key is used to encrypt data that is destined for the recipient, while the private key is used to decrypt that data. The public key is shared with
people that you want to be able to message you securely. The corresponding private key is kept privately, and is the only thing which can be used to
decrypt data encrypted with the corresponding public key.
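The public/private split can be illustrated with textbook RSA on tiny numbers. This is a sketch only: the primes are far too small to be secure, real systems add padding and use keys thousands of bits long, and the variable names are chosen for this example.

```python
# Textbook RSA with toy numbers, for illustration only.
p, q = 61, 53
n = p * q                      # 3233, part of both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent: (e, n) is the public key
d = pow(e, -1, phi)            # private exponent: (d, n) is the private key (Python 3.8+)

def encrypt(message: int) -> int:
    """Anyone holding the public key can do this."""
    return pow(message, e, n)

def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key can undo it."""
    return pow(ciphertext, d, n)

secret = 65                    # the message must be a number smaller than n
ciphertext = encrypt(secret)
print(ciphertext)
print(decrypt(ciphertext))     # 65
```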
||||||||||||||||||||
||||||||||||||||||||
There are two main issues with asymmetric encryption. Firstly, similar to symmetric encryption, the problem is key exchange. In this case, how do we know the public key we have is definitely the public key for the person we are trying to contact? If we got it in person we could be reasonably certain, but if we found the public key on a website how can we verify that it belongs to the person we want to contact?
The other problem is speed. Asymmetric encryption has a much higher overhead and therefore is slower. That’s why a combination of asymmetric
encryption and symmetric encryption is used for HTTPS connections.
||||||||||||||||||||
||||||||||||||||||||
• PGP stands for Pretty Good Privacy. It’s a standard that aimed to make encryption common amongst everyone. Unfortunately it never quite took off.
• It’s asymmetric and uses a public and private key
• The problem was: how do you give your public key to people in a safe way?
• If you put it up on a website, who says that the website wasn’t compromised, and the key swapped out for one the attacker has?
• The safest way is in-person key exchange, but that is just too inconvenient
||||||||||||||||||||
||||||||||||||||||||
HTTPS uses SSL/TLS to encrypt traffic between a client and a web server. HTTPS uses a mixture of asymmetric encryption and symmetric encryption.
Each web server has a certificate which is signed by a certificate authority which is trusted. These are usually big-name companies which deal with
certificates. Browsers will automatically trust certificates signed by these certificate authorities. If your certificate is not signed by a trusted certificate
authority somewhere in the chain, it is not considered trusted and most browsers will display a warning.
||||||||||||||||||||
||||||||||||||||||||
• Since the certificate is trusted, your browser can use that public key
to encrypt data to send to the web server
• But wait – we said asymmetric key encryption is slow, right? So we
don’t want to use it for ALL traffic
• All we do is generate a symmetric key and encrypt it to send to the
web server
• Therefore, the problem with symmetric key exchange has been
solved, and both parties can use that symmetric key to encrypt
traffic for that session
• And it’s symmetric so it will be fast!
The certificate contains the server’s public key. The client then generates a symmetric key and encrypts it with the public key of the server, sending the encrypted symmetric key to the web server. The web server will use its private key to decrypt the symmetric key.
From then on, both parties will use symmetric encryption with the shared key. That’s because symmetric encryption is much faster than asymmetric encryption, so asymmetric encryption is only used for the initial key exchange. (Newer versions of TLS agree on the symmetric key with a Diffie-Hellman exchange rather than sending it encrypted, but the principle is the same: asymmetric cryptography sets up the session, and symmetric encryption protects the traffic.)
You can see that by using a combination of symmetric and asymmetric encryption, the key exchange problem has been mostly resolved. The slow but secure asymmetric step is only needed once per session, and everything after it uses the much faster symmetric key.
||||||||||||||||||||
||||||||||||||||||||
Although the problem of key exchange is mostly resolved, this is far from a perfect solution. What happens if a root certificate authority is compromised?
It’s happened before, where an attacker has been able to compromise one of the companies that provide certificates to most of the Internet and then they
have issued real certificates to malicious actors.
There just isn’t a better system at the moment. Luckily, scenarios where bad certificates have been issued are few and far between.
||||||||||||||||||||
||||||||||||||||||||
Hashing is often described as a form of one-way encryption: data can be hashed, but the hash cannot be turned back into the original data. A good hashing algorithm will produce a different result for each different set of data.
The word ‘hello’ hashed would always have the same hash no matter what. Hashing is most typically used for storing passwords in databases. The idea is
that when someone logs in, the password they enter is hashed and then compared with the hash that was stored in the database. If they match, the password
was correct.
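A quick demonstration using SHA-256 from Python’s standard `hashlib` module. SHA-256 is chosen here as an example algorithm, and the helper name `sha256_hex` is made up. Note that the result depends on the exact bytes hashed: a different capital letter or a stray trailing newline gives a completely different hash.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = sha256_hex("hello")
second = sha256_hex("hello")

print(first)
print(first == second)               # True: same input, same hash, every time
print(first == sha256_hex("Hello"))  # False: a tiny change gives a different hash
```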
||||||||||||||||||||
||||||||||||||||||||
• Then when a user tries to log in, the password they submit is also
hashed and if the two hashes match then the password is correct.
• Since the password is hashed in the database, just having the hash
isn’t enough to find out the password. You would have to guess the
password, hash your guess and compare to find the password.
• There are programs that can do this hundreds of thousands of
times per second based on dictionary words or by going through
every combination of letters, numbers, etc.
• This is why a strong password is important!
There are programs which can calculate hundreds of thousands of hashes per second. This means that if someone gets hold of a password hash and the
password is weak then it can be quickly cracked. A strong password will mean that the program has to run for years or decades before the password can be
cracked. Programs that crack passwords quickly often benefit from using the processing power of graphics cards which excel at number crunching tasks.
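The guess, hash and compare loop is simple to sketch. The tiny word list and the choice of unsalted SHA-256 are assumptions made for this illustration; real cracking tools work the same way in principle but with enormous word lists and far higher speeds.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def dictionary_attack(target_hash: str, wordlist):
    """Hash each guess and compare it with the stolen hash."""
    for guess in wordlist:
        if sha256_hex(guess) == target_hash:
            return guess
    return None   # password was not in the word list

wordlist = ["password", "letmein", "hello", "qwerty"]   # stand-in dictionary

weak_hash = sha256_hex("hello")
strong_hash = sha256_hex("c0rrect-h0rse-battery-staple!")

print(dictionary_attack(weak_hash, wordlist))     # hello
print(dictionary_attack(strong_hash, wordlist))   # None
```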
||||||||||||||||||||
||||||||||||||||||||
Using a hash on its own to store passwords is not a good idea. Usually you should ‘salt’ the password first. That means you take the password and add some text to it before hashing. Even better would be for each account to have its own salt which is stored in the database alongside the hash. This is helpful in preventing pre-computation attacks (rainbow table attacks).
With pre-computation, people hash dictionary words, letter combinations etc. in advance, and store the results in a huge file (TBs in size sometimes). Then it is just a case of looking up the hash in the file, and you will have the resulting plaintext password. By using a salt we can make that more difficult. The addition of the salt means you have to generate a rainbow table for that exact salt, which takes the same amount of time as just cracking the hash live. By using a unique salt for each user, and not just for each application, we prevent people from generating a rainbow table that works for the entire application.
||||||||||||||||||||
||||||||||||||||||||
• A salt defeats this by making the same value hashed come out
differently every time
• This way you don’t save any time by using a rainbow table, since
the hashes will be unique to each user in the database and
therefore storing the calculated hashes is pointless
We can use a salt to defeat a pre-computation attack. This is usually a random value which gets added to the data before hashing it. The salt must be a known value; often it is stored in plaintext next to the hash in a database. Some applications have the same salt across the entire application, but it’s often best to use different salts within an application also. For example, if you were talking about user accounts, each user account might have its own salt.
This prevents an attacker pre-computing the hashes for your application specifically. For example, if all users had the same salt and the attacker had a copy of the users table then they could still save time by generating a rainbow table for that specific salt. If every user has a different salt even within the same database, then generating a rainbow table wouldn’t be worthwhile, because a table that can only crack one user’s password takes as much work as simply cracking that password directly.
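A minimal sketch of per-user salting, assuming SHA-256 and a 16-byte random salt appended to the password. The function names are invented for this example. Production systems generally go further and use a deliberately slow, dedicated password hashing algorithm rather than a single fast hash.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None):
    """Return (salt, hash). A fresh random salt is generated for new passwords."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(password.encode("utf-8") + salt).hexdigest()
    return salt, digest

def verify_password(password: str, salt: bytes, stored_digest: str) -> bool:
    """Re-hash the login attempt with the stored salt and compare."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)

# Two users pick the same password but end up with different stored hashes.
salt_a, hash_a = hash_password("hello")
salt_b, hash_b = hash_password("hello")

print(hash_a == hash_b)                           # False
print(verify_password("hello", salt_a, hash_a))   # True
print(verify_password("hellO", salt_a, hash_a))   # False
```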
||||||||||||||||||||
||||||||||||||||||||
• Examples using the SHA-256 hashing algorithm:
hash("hello") = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
hash("hello" + "QxLUF1bgIAdeQX") = 68290787dab7259f607c8d9fa58df0ac07ec0a33d151a60127a55f429d09ad52
hash("hello" + "bv5PehSMfV11Cd") = 0a7d4f0c9b8962788d4b6b157e9e6cdf794bf5c9b790915dd24ba5ab962a3fe2
hash("hello" + "YYLmfY6IehjZMQ") = 6034fd9ad717f98425f8135fcfe9de956b7d94d885b371394ee149f8a3ce1e24
Here we can see some examples of using a password with a salt. In this case, ‘hello’ is the password. When we combine it with a randomly generated salt
we see the hashes are wildly different.
||||||||||||||||||||
||||||||||||||||||||
When we talk about hashing we also have to talk about collisions. Hashing algorithms are supposed to produce unique hashes. That means one value will always come out to the same hash, and no other value will create a hash which matches. That’s the theory at least.
In practice, collisions do occur: a hash has a fixed length, so there are more possible inputs than possible hashes, and the best an algorithm can do is make collisions infeasible to find. Two popular hashing algorithms still used frequently today are MD5 and SHA1. Both of these are considered broken, in that practical ways of producing collisions have been demonstrated. More recently there is SHA2 and SHA3, which are still considered secure today. There’s also a newer algorithm called Argon2, designed specifically for hashing passwords, which won the Password Hashing Competition in 2015.
||||||||||||||||||||
||||||||||||||||||||
Hardware Components
In Depth
||||||||||||||||||||
||||||||||||||||||||
CPU – Components
The Arithmetic Logic Unit (ALU) is actually two components, the Arithmetic Unit and the Logic Unit. Together they are responsible for mathematical functions and logic operations such as AND, NOT, OR, etc.
The Control Unit (CU) is responsible for the sequence of instructions that the CPU will execute, managing the timing of the CPU (clock cycles),
interpreting instructions and regulating the flow of data to other components along bus lines.
The Registers are small memory storage areas which exist ‘on-die’ (on the CPU itself) and therefore can be accessed very fast, even faster than RAM.
These registers act as temporary storage while the CPU performs instructions. Each register has its own function, such as the EIP register (Extended
Instruction Pointer) which always points to the next instruction to be executed.
||||||||||||||||||||
||||||||||||||||||||
The Arithmetic Logic Unit is often referred to as one component, but it actually consists of two separate components. The first component is the Arithmetic
Unit, responsible for mathematical functions such as multiplication, subtraction, addition and division. The Logic Unit is responsible for logical operations
which return a Boolean value (true or false value). We’ll be covering these more thoroughly in the programming book.
||||||||||||||||||||
||||||||||||||||||||
The Control Unit is responsible for the sequence of instructions which the processor will execute. It is also responsible for managing the timing of the CPU
and regulating the flow of data to other components (along bus lines) such as RAM.
||||||||||||||||||||
||||||||||||||||||||
Registers are small memory storage areas which live on the CPU. They are very fast, faster even than RAM because of their proximity to the processor.
The registers are designed to hold small amounts of data temporarily. Think of it almost like a cache: data is saved in the registers while it is relevant to the
instruction the CPU is executing at the time, so it doesn’t need to keep being retrieved from RAM or Storage.
Some registers have specific purposes; others are known as general purpose registers. EIP (Extended Instruction Pointer) is an example of a register with a
very specific purpose. It holds the memory address of the next instruction for the processor to execute.
||||||||||||||||||||
||||||||||||||||||||
CPU Registers can hold an extremely small amount of data. On a processor with a 32-bit architecture, each register is limited to holding 32 bits of data. On a processor with a 64-bit architecture, each register can hold 64 bits of data. Put simply, this is why you can run programs compiled for 32-bit processors on a 64-bit system, but you can’t run programs compiled for a 64-bit system on a 32-bit processor. 32 bits of data fits into 64 bits after all, but it’s impossible for 64 bits to fit in 32 bits.
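The size limit can be illustrated with plain integers. Python numbers are not registers, so this only models the arithmetic: masking with `0xFFFFFFFF` keeps the low 32 bits, which is what would be left if a larger value were forced into 32 bits. The example values are arbitrary.

```python
MASK_32 = 0xFFFFFFFF          # thirty-two 1 bits

def fits_in_bits(value: int, bits: int) -> bool:
    """True if an unsigned value can be stored in the given number of bits."""
    return 0 <= value < 2 ** bits

small = 0xDEADBEEF            # needs 32 bits
large = 0x1_0000_0001         # needs 33 bits

print(fits_in_bits(small, 32), fits_in_bits(small, 64))   # True True
print(fits_in_bits(large, 32), fits_in_bits(large, 64))   # False True
print(hex(large & MASK_32))   # 0x1  (the top bit has been lost)
```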
||||||||||||||||||||
||||||||||||||||||||
CPU – Architectures
A CPU Architecture is how the processor is designed: including what instructions it supports, the size of its registers and various other factors. The Intel x86 architecture supports Intel instructions and has a 32-bit architecture. That means the registers can hold 32-bit values.
The Intel x64 architecture supports the same set of instructions as x86, along with 64-bit extensions, but programs that are compiled for 64-bit architectures won’t work on 32-bit systems. Programs compiled for 32-bit systems will however work on both 32-bit and 64-bit systems. The reason for this is simple. When a program is compiled for 64-bit systems, the instructions utilise the full 64 bits of the CPU registers. This means that the values that get loaded into the registers won’t fit in the registers of a 32-bit system.
On the other hand, a program compiled for a 32-bit system will have values in the registers which utilise 32 bits. 32-bit values do fit in the 64-bit registers so the program will be compatible with both architectures.
Of course, there are some architectures which are incompatible because they use a completely different instruction set. Take ARM, for example, which
uses different instructions to Intel. A program compiled for Intel will not run on an ARM processor and vice versa.
||||||||||||||||||||
||||||||||||||||||||
The CPU functions in a fetch – decode – execute cycle where the next instruction is first fetched, decoded and executed, and then the cycle starts again
with fetching the next instruction.
This loop is always working while the computer is on. The number of cycles per second is called the clock rate and it’s often expressed as a value in GHz
such as 3.2 GHz. It’s an indication of the processor’s speed, but it isn’t the only aspect to take into consideration.
||||||||||||||||||||
||||||||||||||||||||
CPU – Fetch
At the fetch stage the next instruction is fetched from the memory address contained in the Program Counter. The program counter is also known as the
Instruction Pointer. On 32-bit systems it is known as EIP (Extended Instruction Pointer) and on 64-bit systems it is RIP. The instruction is then stored in the
Instruction Register and the Program Counter is updated to point to the next instruction.
||||||||||||||||||||
||||||||||||||||||||
CPU – Decode
The instruction that was fetched into the Instruction Register is decoded by the Instruction Decoder. Each valid instruction is programmed into the
processor and the valid set of instructions is known as the instruction set which the CPU architecture understands. The instruction decoder looks at the
instruction set to understand what it has to do next to execute the instruction.
||||||||||||||||||||
||||||||||||||||||||
CPU – Execute
At this stage the decoded instruction is executed. When execution is finished the CPU loops back to the fetch cycle again ready for the next instruction.
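The whole fetch – decode – execute cycle can be sketched as a loop over a made-up machine. The three-instruction set (LOAD, ADD, HALT) and the single register ‘A’ are invented for this illustration and do not correspond to any real CPU.

```python
def run(program):
    """Run a toy program and return the final value of register A."""
    registers = {"A": 0}
    program_counter = 0

    while True:
        # Fetch: read the instruction the program counter points at,
        # then advance the counter to the next instruction.
        instruction_register = program[program_counter]
        program_counter += 1

        # Decode: split the instruction into an operation and its operand.
        opcode, operand = instruction_register

        # Execute: carry out the operation.
        if opcode == "LOAD":
            registers["A"] = operand
        elif opcode == "ADD":
            registers["A"] += operand
        elif opcode == "HALT":
            return registers["A"]
        else:
            raise ValueError(f"unknown instruction: {opcode}")

program = [("LOAD", 2), ("ADD", 3), ("HALT", None)]
print(run(program))   # 5
```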
||||||||||||||||||||
||||||||||||||||||||
When we talk about memory we mean RAM (Random Access Memory). There are two main parts of RAM: the stack, and the heap. The stack is a fast access area that stores temporary data during program execution. The stack is managed by the processor automatically: the programmer does not usually have to worry about it. The stack is a memory structure known as LIFO (Last In First Out). When you have a function in your code, the data on the stack that is created by that function only exists while the function is running.
||||||||||||||||||||
||||||||||||||||||||
Picture a stack of plates. You can’t just take a plate from the middle of the stack: you first have to lift all the plates on top of the one you are trying to reach.
This is a Last In First Out (LIFO) structure, because the last plate you put on the stack is also the first one you can remove - meanwhile the first plate on the
stack has to have all the ones above it removed before you can access it.
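The plate analogy maps directly onto a Python list used as a stack, where `append` puts a plate on top and `pop` takes the top plate off. The plate names are just labels for the example.

```python
stack = []

# Push three plates, one on top of the other.
stack.append("plate 1")
stack.append("plate 2")
stack.append("plate 3")

# Pop them again: the last plate in is the first plate out.
print(stack.pop())   # plate 3
print(stack.pop())   # plate 2
print(stack.pop())   # plate 1
```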
||||||||||||||||||||
||||||||||||||||||||
The other section of memory is called the heap. It is relatively slower to access, but unlike the stack there is no fixed size limit and the CPU doesn’t manage it - so the programmer has to take that into account. That means the programmer is responsible for specifying when they need to allocate some space on the heap, and for freeing that block of memory when they are done with it.
Memory on the heap tends to fragment over time. In addition, unlike the stack, data on the heap can be read from any part of the program, and not just the function that created the data.
||||||||||||||||||||
||||||||||||||||||||
Memory addresses are a logical construct created by the Operating System kernel to allow access to RAM. These are hexadecimal numbers such as
0xFFFFFFFF. The stack starts from the highest memory addresses and grows down, while the heap starts from the lowest memory addresses and grows up.
Theoretically it is possible for them to meet in the middle - however in practice this should never happen because the stack will keep shifting in size,
growing and then shrinking as needed.
||||||||||||||||||||
||||||||||||||||||||
Here we can see the EBP register points to the base of the stack, while ESP points to the top of the stack. The top of the stack is confusingly on the bottom
because the stack grows in a downward direction. Then there’s the heap which grows upwards.
||||||||||||||||||||
||||||||||||||||||||
When we talk about Storage we mean permanent filesystem storage. Each storage device is split up into clusters. A cluster is the smallest logical amount of space that can be used to hold a file. So if you have a cluster of say 16 KB then a file that is 3 KB will still take up the whole 16 KB. The unused space within the cluster is known as ‘slack space’.
Similar to RAM, each cluster has an address which can be used to access the cluster.
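The slack space arithmetic from the example can be written out as a small function. The function name is made up, and sizes are given in bytes with 1 KB taken as 1024 bytes.

```python
import math

KB = 1024

def slack_space(file_size: int, cluster_size: int) -> int:
    """Bytes allocated to a file but not used by it."""
    clusters_needed = max(1, math.ceil(file_size / cluster_size))
    return clusters_needed * cluster_size - file_size

print(slack_space(3 * KB, 16 * KB) // KB)    # 13  (a 3 KB file wastes 13 KB)
print(slack_space(16 * KB, 16 * KB) // KB)   # 0   (an exact fit wastes nothing)
print(slack_space(17 * KB, 16 * KB) // KB)   # 15  (a second cluster is mostly empty)
```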
||||||||||||||||||||
||||||||||||||||||||
• The name of the file is stored in an index which lists the starting addresses where the data can be located
• If a file is deleted, the index entry is deleted and the storage address is marked as over-writable
• This means that the cluster is free to be re-used. However, the data is still there unless another file overwrites it.
• This is the reason why you can recover deleted files from a hard disk
• This is also the reason why secure deletion software should be used to delete sensitive documents. These tools work by overwriting the data at least once.
The specifics of how storage is used depend on the filesystem. Generally speaking there is an index which stores the name of each file and other metadata, as well as the address of the first cluster where the file can be located.
When we talk about deleting a file, all we are doing is deleting the entry in the index and marking those clusters with a flag that says they can be overwritten if necessary. That’s because truly removing data from storage means overwriting it with 0s, which is wasteful. It’s much more efficient to just mark the cluster as over-writable and overwrite the data when more data has to be saved there. That of course is the reason why data can often be recovered after it has been deleted: the data is still on the disk, marked as over-writable, but it has not been overwritten yet.
Secure deletion software works differently: instead of just marking the data as over-writable, it also writes 0s over the data.
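The difference between a normal delete and a secure delete can be shown with a toy model. The sketch below is not how any real filesystem is implemented: `ToyDisk` and all of its methods are invented for this illustration, and the disk is just a Python list of small byte strings.

```python
class ToyDisk:
    """A toy storage device: fixed-size clusters plus an index of files."""

    def __init__(self, cluster_count: int, cluster_size: int = 8):
        self.cluster_size = cluster_size
        self.clusters = [bytes(cluster_size) for _ in range(cluster_count)]
        self.free = [True] * cluster_count  # the 'over-writable' flags
        self.index = {}                     # file name -> cluster numbers

    def write(self, name: str, data: bytes) -> None:
        size = self.cluster_size
        needed = max(1, -(-len(data) // size))
        free_ids = [i for i, is_free in enumerate(self.free) if is_free]
        if len(free_ids) < needed:
            raise OSError("disk full")
        used = free_ids[:needed]
        for n, cid in enumerate(used):
            chunk = data[n * size:(n + 1) * size]
            self.clusters[cid] = chunk.ljust(size, b"\x00")
            self.free[cid] = False
        self.index[name] = used

    def delete(self, name: str) -> None:
        # Normal delete: drop the index entry and flag the clusters as
        # over-writable. The bytes themselves are left untouched.
        for cid in self.index.pop(name):
            self.free[cid] = True

    def secure_delete(self, name: str) -> None:
        # Secure delete: overwrite the clusters with 0s, then free them.
        for cid in self.index.pop(name):
            self.clusters[cid] = bytes(self.cluster_size)
            self.free[cid] = True

    def raw_scan(self) -> bytes:
        """What a recovery tool sees: every cluster, ignoring the index."""
        return b"".join(self.clusters)


disk = ToyDisk(cluster_count=4)
disk.write("secret.txt", b"hunter2")
disk.delete("secret.txt")
print(b"hunter2" in disk.raw_scan())  # True: the data is still on disk

disk.write("secret.txt", b"hunter2")
disk.secure_delete("secret.txt")
print(b"hunter2" in disk.raw_scan())  # False: the data was zeroed
```

After the normal delete the file no longer appears in the index, yet a raw scan of the clusters still finds its contents. After the secure delete the clusters contain only 0s.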
||||||||||||||||||||
||||||||||||||||||||
If a file needs to be stored, and it is larger than a cluster, one of three things can happen:
If there are other free clusters that are contiguous with (follow on from) the first cluster, the data will be written to those clusters.
If there are no other free contiguous clusters, the rest of the data will be stored at a different address, and a pointer to that address will be stored at the end of the previous cluster.
On some filesystems, a File Allocation Table is used to keep track of how the clusters on a storage device are chained together. The start cluster will point to an entry in the file allocation table, which will in turn contain the address of the next cluster. That cluster will in turn have an entry in the file allocation table which contains the address of the next cluster, and so on, until the end of the file is reached.
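The file allocation table lookup described above can be sketched as a simple chain walk. This is a simplified model, not the real on-disk FAT format: the table is a Python dictionary, and the `END_OF_CHAIN` marker, the `directory` index and the cluster numbers are all made up for the example.

```python
END_OF_CHAIN = -1  # illustrative marker; real FAT uses reserved values


def follow_chain(fat: dict, start_cluster: int) -> list:
    """Return every cluster of a file, in order, from its start cluster."""
    chain = []
    seen = set()
    current = start_cluster
    while current != END_OF_CHAIN:
        if current in seen:
            # A corrupted table could loop back on itself forever.
            raise ValueError("loop detected in file allocation table")
        seen.add(current)
        chain.append(current)
        current = fat[current]  # the table entry holds the next cluster
    return chain


# A file that starts at cluster 2 and is fragmented across the disk.
fat = {2: 3, 3: 7, 7: 8, 8: END_OF_CHAIN}
directory = {"report.doc": 2}  # the index stores only the start cluster

print(follow_chain(fat, directory["report.doc"]))  # [2, 3, 7, 8]
```

Clusters 2 and 3 are contiguous, while the jump from 3 to 7 shows how the table lets a file continue at a different address.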
||||||||||||||||||||
||||||||||||||||||||
There are many different types of filesystems available. We will be covering some of them in the next few slides:
• FAT32
• NTFS
• ExFAT
• Ext3 / Ext4
• HFS+
||||||||||||||||||||
||||||||||||||||||||
Storage – FAT32
FAT32 is definitely an older filesystem: it was introduced with Windows 95 (in the OSR2 release). It doesn’t support files larger than 4 GB, which can prove an issue these days. It does not support permissions or any of the other security features which you often see in more modern filesystems. You can sometimes still see it on removable media such as USB drives, but less so since the introduction of ExFAT.
||||||||||||||||||||
||||||||||||||||||||
Storage – NTFS
This is the filesystem that modern versions of Windows use. It has a file size limit, but the limit is so large that you won’t hit it on today’s hardware. It has a lot of nice features such as permissions, encryption and shadow copy (backups). However, it is not very compatible with non-Windows operating systems. A Mac can read from NTFS volumes, for example, but by default it can’t write to them.
||||||||||||||||||||
||||||||||||||||||||
Storage – ExFAT
ExFAT was introduced in 2006 and has since gained more traction. It is supported by nearly every modern Operating System, and is actually very similar to FAT32, except that in practice there are no file size limits to worry about.
ExFAT is still a very minimal filesystem, with no permissions support or other security features. The benefit of ExFAT, however, is that it is compatible with Windows, Mac and Linux - and so it is really an ideal filesystem for USB drives and other removable media.
||||||||||||||||||||
||||||||||||||||||||
Storage – Ext3
• Introduced in 2001
• A filesystem for Linux
• Maximum file size of 2TB (with the largest block size)
• Supports journaling, where all changes to the file system are tracked in a separate part of the hard disk called the ‘journal’. This means in the event of a crash the chances of the filesystem becoming unrecoverable are lessened.
• Supports file permissions and other security features
Ext3 is a Linux filesystem. It supports files up to 2TB in size and supports journaling, a process whereby all changes to the file system are tracked in a separate part of the hard disk called the ‘journal’. This means that in the event of a crash the chances of unrecoverable filesystem corruption are lessened. Similar to NTFS, it also supports filesystem permissions and other security features - although it does not support shadow copy.
||||||||||||||||||||
||||||||||||||||||||
Storage – Ext4
• Introduced in 2008
• A filesystem for Linux
• Essentially no file size limitations
• Supports journaling
• Several new features, such as fast disk checking, which improve performance and reliability
• Option to disable journaling feature
• Supports file permissions and other security features
Ext4 is another, newer Linux filesystem and essentially has no file size limitations. Like its predecessor Ext3, it supports journaling, and it adds newer features such as fast disk checking to improve performance and reliability. It’s also possible to disable journaling. The file system permissions and other security features that Ext3 has are also supported.
||||||||||||||||||||
||||||||||||||||||||
Storage – HFS+
This is a proprietary filesystem from Apple which is natively supported only by Mac OS. As with any good modern filesystem, the file size limits are so large that there are essentially no limits. It supports journaling, file permissions (including extended permissions) and other security features.
||||||||||||||||||||
||||||||||||||||||||
The Graphics Processing Unit (GPU) is a dedicated processor that excels at number crunching. GPUs are useful to the security community because we can use them to perform high-speed cryptographic tasks such as hashing.
Usually this means you have a password hash and you wish to crack it. By leveraging the power of the GPU it’s possible to generate millions or even billions of hashes a second (depending on the hashing algorithm). This allows us to speed up password cracking significantly.
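To make the idea of cracking a hash concrete, here is a minimal CPU-only sketch using Python’s standard `hashlib` module. It is not how hashcat or any GPU tool is implemented: the tiny word list, the target password and the function names are invented for the example. The point is that each guess is hashed and compared independently, which is exactly the kind of work a GPU can run many thousands of times in parallel.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Hex MD5 digest of a password (MD5 is used only for illustration)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def crack(target_hash: str, candidates):
    """Hash each candidate in turn; return the one that matches, if any."""
    for candidate in candidates:
        if md5_hex(candidate) == target_hash:
            return candidate
    return None


# Hypothetical word list and target, made up for this example.
wordlist = ["123456", "letmein", "password", "qwerty"]
target = md5_hex("password")

print(crack(target, wordlist))  # password
```

A hash cannot be reversed directly, so the only approach is to guess, hash and compare. The speed at which guesses can be hashed is therefore what limits the attacker.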
||||||||||||||||||||
||||||||||||||||||||
Let’s take a look at the benchmarks for a password cracking rig that runs 8 NVIDIA GTX 1080 GPUs using the password cracking tool ‘hashcat’:
Hashtype: MD5
Speed.Dev.#1.: 24943.1 MH/s (97.53ms)
Speed.Dev.#2.: 24788.6 MH/s (96.69ms)
Speed.Dev.#3.: 25022.2 MH/s (97.76ms)
Speed.Dev.#4.: 25106.6 MH/s (97.42ms)
Speed.Dev.#5.: 25114.1 MH/s (97.42ms)
Speed.Dev.#6.: 24924.1 MH/s (97.30ms)
Speed.Dev.#7.: 25197.9 MH/s (97.30ms)
Speed.Dev.#8.: 25246.4 MH/s (97.00ms)
Speed.Dev.#*.: 200.3 GH/s
The last figure of 200.3 GH/s is the total password cracking capacity across all eight cards. To be clear, that’s 200 billion password hashes that can be calculated per second. Granted, MD5 is a somewhat lightweight hashing algorithm, but it is still an impressive figure.
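The total can be checked by adding up the per-device speeds, and the same figure gives a feel for how long a brute-force search would take. The sketch below uses the benchmark numbers from the output above; the two password policies (8 lowercase letters, and 8 characters drawn from the 95 printable ASCII characters) are assumptions chosen for illustration, and the calculation is the worst case of trying every combination.

```python
# Per-device MD5 speeds from the benchmark, in millions of hashes per second.
per_device_mhs = [24943.1, 24788.6, 25022.2, 25106.6,
                  25114.1, 24924.1, 25197.9, 25246.4]

total_hashes_per_second = sum(per_device_mhs) * 1e6
print(round(total_hashes_per_second / 1e9, 1))  # 200.3 (GH/s)


def seconds_to_exhaust(alphabet_size: int, length: int, rate: float) -> float:
    """Worst-case time to try every password of the given length."""
    return alphabet_size ** length / rate


# Illustrative policies only; real passwords and attacks vary widely.
lowercase_8 = seconds_to_exhaust(26, 8, total_hashes_per_second)
printable_8 = seconds_to_exhaust(95, 8, total_hashes_per_second)

print(f"8 lowercase letters: {lowercase_8:.2f} seconds")
print(f"8 printable ASCII characters: {printable_8 / 3600:.1f} hours")
```

At this speed every 8-letter lowercase password can be tried in roughly one second, and every 8-character printable ASCII password in roughly nine hours, assuming the passwords are stored as plain MD5 hashes.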
||||||||||||||||||||
||||||||||||||||||||
Up Next…
||||||||||||||||||||