
Information Technology Networking

ICTSAS426 - Locate and troubleshoot ICT equipment, system and software faults

Learner Materials and Assessment Tasks


Table of Contents

About ICTSAS426 Locate and troubleshoot ICT equipment, system and software faults
Develop a troubleshooting process to help resolve problems
Activity 1
Analyse and document the system that requires troubleshooting
Activity 2
Identify available fault finding tools and determine the most appropriate for the identified problem
Activity 3
Obtain the required fault finding tools
Identify legislation, health and safety requirements, codes, regulations and standards related to the problem area
Collect data relevant to the system
Activity 4
Analyse the data to determine if there is a problem and the nature of the problem
Determine specific symptoms of hardware, operating system and printer problems
Activity 5
Formulate a solution and make provision for rollback
Systematically test variables until the problem is isolated
Activity 6
Activity 7
Rectify the problem
Activity 8
Activity 9
Create a list of probable causes of the problem
Activity 10
Test the system to ensure the problem has been solved and record results
Activity 11
Identify and implement common preventative maintenance techniques to support ongoing maintenance strategies
Document the signs and symptoms of the problem and its solution, and load to database of problems or solutions for future reference
Activity 12
Activity 13
Assessment

About ICTSAS426 Locate and troubleshoot ICT equipment, system
and software faults
Application

This unit describes the skills and knowledge required to troubleshoot problems and apply systematic
processes to fault finding across a wide range of information and communications technology (ICT)
disciplines.

It applies to individuals who apply a systematic approach to finding faults, troubleshooting problems
and solving issues in a wide range of ICT related areas.

No licensing, legislative, or certification requirements apply to this unit at the time of publication.

Unit Sector

Systems administration and support

Elements and Performance Criteria


ELEMENT: Elements describe the essential outcomes.
PERFORMANCE CRITERIA: Performance criteria describe the performance needed to demonstrate achievement of the element.

1. Choose the most appropriate fault finding method
    1.1 Develop a troubleshooting process to help resolve problems
    1.2 Analyse and document the system that requires troubleshooting
    1.3 Identify available fault finding tools and determine the most appropriate for the identified problem
    1.4 Obtain the required fault finding tools
    1.5 Identify legislation, health and safety requirements, codes, regulations and standards related to the problem area

2. Analyse the problem to be solved
    2.1 Collect data relevant to the system
    2.2 Analyse the data to determine if there is a problem and the nature of the problem
    2.3 Determine specific symptoms of hardware, operating system and printer problems

3. Identify a solution and rectify the problem
    3.1 Formulate a solution and make provision for rollback
    3.2 Systematically test variables until the problem is isolated
    3.3 Rectify the problem
    3.4 Create a list of probable causes of the problem

4. Test system and complete documentation
    4.1 Test the system to ensure the problem has been solved and record results
    4.2 Identify and implement common preventative maintenance techniques to support ongoing maintenance strategies
    4.3 Document the signs and symptoms of the problem and its solution, and load to database of problems or solutions for future reference

Foundation Skills

This section describes language, literacy, numeracy and employment skills incorporated in the
performance criteria that are required for competent performance.

Skill: Reading
Performance Criteria: 1.5, 2.1, 2.2
Description:
• Identifies, analyses and evaluates technical textual information and technical system data to source solutions and determine necessary actions

Skill: Writing
Performance Criteria: 1.2, 3.4, 4.1, 4.3
Description:
• Records specific information relating to issues and outcomes in a sequential manner using correct grammar and spelling

Skill: Navigate the world of work
Performance Criteria: 1.5
Description:
• Identifies and complies with organisational and legislative requirements

Skill: Get the work done
Performance Criteria: 1.1-1.5, 2.1-2.3, 3.1-3.4, 4.1-4.3
Description:
• Takes responsibility for planning, sequencing and prioritising tasks and own workload for efficiency and effective outcomes
• Applies analytical processes to resolve technical or conceptual problems
• Uses the main features and functions of digital tools to complete work tasks

Unit Mapping Information
Code and title, current version: ICTSAS426 Locate and troubleshoot ICT equipment, system and software faults
Code and title, previous version: ICASAS426A Locate and troubleshoot IT equipment, system and software faults
Comments: Updated to meet Standards for Training Packages. Minor edit to the competency title.
Equivalence status: Equivalent unit

Assessment requirements - Modification History


Release: Release 1
Comments: This version first released with ICT Information and Communications Technology Training Package Version 1.0.

Performance Evidence - Evidence of the ability to:

• determine the most appropriate fault finding method
• document the troubleshooting process
• analyse and identify faults
• obtain suitable tools and equipment
• apply simple checks, tests and fault finding methodologies
• apply the recommended means to rectify fault and document results.

Note: Evidence must be provided for at least TWO organisations or situations.

Knowledge Evidence

To complete the unit requirements safely and effectively, the individual must:

• explain client support and maintenance practices
• identify and describe current industry accepted hardware and software products, including
features and capabilities
• discuss the system's current functionality, including details of the proposed system
modifications
• describe one or more change management tools
• explain the key features of quality assurance practices with regard to locating and
troubleshooting information and communications technology (ICT) equipment, system and
software faults
• outline the change control procedures of the organisation
• describe a range of trouble shooting methodologies and system testing tools
• list and describe common symptoms of faulty ICT equipment
• identify and describe legislative, regulatory, standards or codes of practice that impact on the
ICT service sector.

Develop a troubleshooting process to help resolve problems [1]

Troubleshooting Overview

Whether an issue stems from a hardware or software problem, you need a reliable troubleshooting
plan. Guesswork and random solutions are unreliable and often unsuccessful. An effective
troubleshooting plan starts with gathering information, observing symptoms, and doing research.

Figure 27-1 illustrates a six-step troubleshooting model used by Microsoft Product Support Services
engineers, who call it the “detect method.”

[1] Source: Microsoft, https://technet.microsoft.com/en-us/library/bb457121.aspx, as at 9 December 2016.

Figure 27-1 Troubleshooting model

Based on research in problem solving, the six steps of this troubleshooting model are as follows:

• Discover the problem.

Identify and document problem symptoms, and search technical information resources to
determine whether the problem is a known condition. For more information, see “Identify
Problem Symptoms” and “Check Technical Information Resources” later in this chapter.

• Evaluate system configuration.

Review your system’s history to determine what configuration changes occurred since the
computer last worked correctly. Did you install new hardware or software? Did you verify that
the hardware or software is fully compatible with Windows XP Professional?

• Track possible solutions.

Instead of using the trial-and-error approach, review Microsoft Knowledge Base articles. You
can simplify troubleshooting by temporarily removing hardware and software that is not
needed for starting Windows XP Professional. Consider enabling Windows XP Professional
logging options to better evaluate your troubleshooting efforts.

• Execute a plan.

Test potential solutions and have a contingency plan if these solutions do not work or have a
negative impact on the computer. Be sure to back up critical system or application files.

• Check results.

Determine whether your plan was successful. Have another plan in place to address
unresolved issues.

• Take a proactive approach.

Document changes that you make along the way while troubleshooting the problem. After
resolving the problem, organize your notes and evaluate your experience. Think about ways
to avoid or reduce the impact of the problem in the future.

There are many different things that could cause a problem with your computer. No matter what's
causing the issue, troubleshooting will always be a process of trial and error—in some cases, you
may need to use several different approaches before you can find a solution; other problems may
be easy to fix. We recommend starting by using the following tips [2].

• Write down your steps: Once you start troubleshooting, you may want to write down each
step you take. This way, you'll be able to remember exactly what you've done and can avoid
repeating the same mistakes. If you end up asking other people for help, it will be much
easier if they know exactly what you've tried already.
• Take notes about error messages: If your computer gives you an error message, be sure to
write down as much information as possible. You may be able to use this information later
to find out if other people are having the same error.
• Always check the cables: If you're having trouble with a specific piece of computer
hardware, such as your monitor or keyboard, an easy first step is to check all related cables
to make sure they're properly connected.

[2] Source: GCF LearnFree, http://www.gcflearnfree.org/computerbasics/basic-troubleshooting-techniques/1/, as at 9 December 2016.

• Restart the computer: When all else fails, restarting the computer is a good thing to try.
This can solve a lot of basic issues you may experience with your computer.

Using the process of elimination

If you're having an issue with your computer, you may be able to find out what's wrong using the
process of elimination. This means you'll make a list of things that could be causing the problem
and then test them out one by one to eliminate them. Once you've identified the source of your
computer issue, it will be easier to find a solution.

Scenario:

Let's say you're trying to print out invitations for a birthday party, but the printer won't print. You
have some ideas about what could be causing this, so you go through them one by one to see if you
can eliminate any possible causes.

First, you check the printer to see that it's turned on and plugged in to the surge protector. It is, so
that's not the issue. Next, you check to make sure the printer's ink cartridge still has ink and that
there is paper loaded in the paper tray. Things look good in both cases, so you know the issue has
nothing to do with ink or paper.

Now you want to make sure the printer and computer are communicating correctly. If you recently
downloaded an update to your operating system, it might interfere with the printer. But you know
there haven't been any recent updates and the printer was working yesterday, so you'll have to
look elsewhere.

You check the printer's USB cord and find that it's not plugged in. You must have unplugged it
accidentally when you plugged something else into the computer earlier. Once you plug in the USB
cord, the printer starts working again. It looks like this printer issue is solved!
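
The printer scenario can be sketched in code as an ordered list of candidate causes, each paired with a check. This is only an illustrative sketch: the `printer_state` dictionary and the check descriptions are hypothetical stand-ins for the manual inspections described above, not a real printer API.

```python
from typing import Callable, Optional

# Hypothetical observations made by the person troubleshooting (mirrors the scenario).
printer_state = {
    "powered_on": True,
    "has_ink": True,
    "has_paper": True,
    "recent_os_update": False,
    "usb_connected": False,
}

# Each candidate cause is paired with a check that returns True when the cause is ruled out.
candidate_causes: list[tuple[str, Callable[[dict], bool]]] = [
    ("Printer is off or unplugged", lambda s: s["powered_on"]),
    ("Out of ink", lambda s: s["has_ink"]),
    ("Out of paper", lambda s: s["has_paper"]),
    ("Recent operating system update", lambda s: not s["recent_os_update"]),
    ("USB cable disconnected", lambda s: s["usb_connected"]),
]


def find_first_cause(state: dict) -> Optional[str]:
    """Test causes one by one and return the first that cannot be eliminated."""
    for description, is_ruled_out in candidate_causes:
        if is_ruled_out(state):
            print(f"Eliminated: {description}")
        else:
            return description
    return None  # every listed cause was eliminated; the list needs extending


print("Probable cause:", find_first_cause(printer_state))
```

If every listed cause is eliminated, the function returns `None`, which is the cue to add new candidate causes to the list and test again.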

Troubleshooting Concepts

The immediate goal of any troubleshooting session is to restore service as quickly as possible.
However, the larger goal is to determine the cause of the problem. Root-cause analysis is the practice
of searching for the source of problems to prevent them from recurring.

Problems represent deviations from known or expected behaviour, and the most effective way to
solve a problem is to gather information before acting and then isolate and eliminate variables.

Identify Problem Symptoms

Start by observing and identifying symptoms of the problem. You need to learn more about the
circumstances in which problems occur and become familiar with system behaviour when issues arise.
Here are some questions that you can use to help identify symptoms:

Do error messages appear?

If error messages appear, record the error numbers, the exact message text, and a brief description
of the activity. This information is useful when researching the cause of the problem or when
consulting with technical support. In your description, include events that precede or follow a problem
and the time and date of the error. For complex or lengthy messages, you can use a program such as
Microsoft Paint (Mspaint.exe) to record the error message as a bitmap.
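
As a rough illustration of the details worth recording, the sketch below stores one error as a structured record. The `ErrorRecord` class, its field names, and the sample values are all assumptions made for this example; they are not part of any Windows tool.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ErrorRecord:
    """One observed error, with the details the text recommends recording."""
    error_number: str
    message_text: str
    activity: str
    occurred_at: datetime
    related_events: list[str] = field(default_factory=list)  # what preceded or followed

    def summary(self) -> str:
        """Return a one-line summary suitable for a support request."""
        timestamp = self.occurred_at.strftime("%Y-%m-%d %H:%M")
        return f"[{timestamp}] {self.error_number}: {self.message_text} (while {self.activity})"


# Sample values are invented for illustration only.
record = ErrorRecord(
    error_number="0x0000007B",
    message_text="INACCESSIBLE_BOOT_DEVICE",
    activity="starting the computer",
    occurred_at=datetime(2016, 12, 9, 9, 30),
    related_events=["Storage driver updated the previous day"],
)
print(record.summary())
```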

To capture an on-screen error message

1. Click the window or dialog box that contains the error message.
2. To capture the contents of the entire desktop, press PRINT SCREEN (or PrtScn).

– or –

To capture an image of the active (foreground) desktop window only, press Alt+Print Screen
(or PrtScn).

3. In the Run dialog box, in the Open box, type:

mspaint

4. On the Edit menu, click Paste.


5. If the prompt The image in the clipboard is larger... appears, click Yes.
6. On the File menu, click Save As, type a file name for the image, and then click Save.

Error messages might appear before Windows XP Professional starts. For example, motherboard or
storage adapter firmware might display an error message if self-tests detect a hardware problem. If
you are unable to record the message quickly enough, you can pause the text display by pressing
PAUSE BREAK. To continue, press Ctrl+Pause Break.

Did you check Event Viewer logs?

Entries in Event Viewer’s application, security, and system logs might contain information helpful for
determining the cause of the problem. Look for symptoms or signs of problems that occur at frequent
or regular intervals.
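
One rough way to spot entries that recur frequently is to count how often each source and event ID pair appears. The sketch below assumes the entries have already been copied out of Event Viewer into a list; the sample values are invented, the `frequent_events` helper is hypothetical, and no Event Viewer API is used.

```python
from collections import Counter

# Invented (source, event ID) pairs, as they might be copied from the System log.
events = [
    ("Disk", 51),
    ("Tcpip", 4201),
    ("Disk", 51),
    ("Service Control Manager", 7000),
    ("Disk", 51),
    ("Disk", 51),
]


def frequent_events(entries: list[tuple[str, int]], minimum_count: int = 3):
    """Return (entry, count) pairs that occur at least minimum_count times."""
    counts = Counter(entries)
    return [(entry, count) for entry, count in counts.most_common()
            if count >= minimum_count]


for (source, event_id), count in frequent_events(events):
    print(f"{source} event {event_id} occurred {count} times")
```

The threshold of three occurrences is arbitrary; a suitable value depends on how long a period the log covers.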

Did you check log files on your computer?

Error messages sometimes direct you to view a log file on your computer. The operating system or an
application typically saves log files in text format. By using Notepad or an equivalent text editor, you
can view the contents of a text log file to determine whether it contains information useful for
troubleshooting your problem.
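
For long log files, a short script can pick out the lines most likely to matter. The following is a minimal sketch that assumes a plain-text log and a hypothetical set of keywords; real log formats vary, so the pattern would need adjusting.

```python
import re

# The keywords are an assumption; adjust them to the application's own log format.
PROBLEM_PATTERN = re.compile(r"\b(error|fail(ed|ure)?|warning)\b", re.IGNORECASE)


def find_problem_lines(log_text: str) -> list[tuple[int, str]]:
    """Return (line number, text) pairs for lines that mention a problem keyword."""
    matches = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if PROBLEM_PATTERN.search(line):
            matches.append((number, line.strip()))
    return matches


# Invented sample; in practice, read the text from the log file instead.
sample_log = """\
09:00:01 Service started
09:00:05 Warning: disk space low on C:
09:00:09 Loading driver video.sys
09:00:10 Error 1603: installation failed
"""

for number, line in find_problem_lines(sample_log):
    print(f"line {number}: {line}")
```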

Does the problem coincide with an application or activity?

If the problem occurs when an application is running or during activities such as network printing or
Internet browsing, you can reproduce the error to observe details and gather information for
troubleshooting purposes. Be sure to record what applications and features are being used when the
problem occurs.

Do previous records exist?

Check to see whether there are records that describe changes, such as the software installed or
hardware that has been upgraded. If records are not available, you might query users or other support
technicians. Pay special attention to recent changes such as Service Packs applied, device drivers
installed, and motherboard or peripheral firmware versions. This information can help you determine
whether the problem is new or a condition that has worsened.

Is baseline information available?

Baseline information is system configuration and performance data taken at various times to mark
hardware and software changes. If possible, compare current baselines with previous ones to
determine the effects of recent changes on system performance. If previous baselines are not
available, you can generate a baseline to evaluate recent efforts to troubleshoot your current system
configuration.
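
Comparing baselines can be as simple as calculating the percentage change for each measurement. The sketch below is one possible approach; the metric names, the sample figures, and the 20 percent threshold are assumptions chosen for illustration.

```python
def compare_baselines(previous: dict[str, float], current: dict[str, float],
                      threshold_pct: float = 20.0) -> dict[str, float]:
    """Return metrics whose value changed by more than threshold_pct percent."""
    flagged = {}
    for metric, old_value in previous.items():
        if metric not in current or old_value == 0:
            continue  # a percentage change cannot be calculated
        change_pct = (current[metric] - old_value) / old_value * 100
        if abs(change_pct) > threshold_pct:
            flagged[metric] = round(change_pct, 1)
    return flagged


# Invented measurements for illustration.
baseline_before = {"boot_seconds": 40.0, "free_memory_mb": 512.0, "disk_queue_length": 1.0}
baseline_after = {"boot_seconds": 65.0, "free_memory_mb": 480.0, "disk_queue_length": 1.1}

print(compare_baselines(baseline_before, baseline_after))
```

With these sample figures, only the boot time is flagged, because it is the only metric that changed by more than the threshold.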

Does the problem seem related to user profiles?

Do other users who log on to the same computer have similar problems? Are all users who do not
experience problems using Administrator accounts, or do they share other common attributes? For
example, check whether the problem occurs when using a newly created user account.

Does the problem seem network related?

Determine whether the same error occurs on more than one computer on a network. See whether
the error happens when you log on locally or use a domain account.

Is incompatible or untested software installed?

Are you using unsigned or beta drivers? Installing software not fully tested for compatibility with
Windows XP Professional or using unsigned drivers can cause erratic behaviour or instability.

Do you have backups to examine?

If you can establish a time frame for the problem, try to locate earlier system backups. Examining the
differences between current and previous configurations can help you identify system components or
settings that have changed. In addition to examining backups, you can use the System Restore tool to
save or restore system states. By comparing the current state to past states, you might be able to
determine when the changes occurred and identify the components or settings affected.

Check Technical Information Resources

After you gather information about key symptoms, check internal and external technical information
sources for ideas, solutions, and similar or related symptoms reported by others.

Information resources such as Windows XP Professional–related newsgroups and the Microsoft
Knowledge Base can save you time and effort. The ideal situation is that your problem is a known
issue, complete with solutions or suggestions that point you in the right direction. See sources of
information shown in Table 27-1.

Table 27-1 Help and Information Sources

Source: Help and Support Centre
Description: Provides access to troubleshooting tools, wizards, information, and links that cover a wide range of Windows XP Professional–related topics, including:
• Hardware devices, such as modems and network adapters
• Networking and the Internet
• Multimedia applications and devices
• E-mail, printing, and faxing
• Working remotely
• Remote assistance and troubleshooting
• System information and diagnostics
• Troubleshooting tools and diagnostic programs provided by Windows XP Professional
To do a search using this feature, on the Start menu, click Help and Support.

Source: Help Desk, Problem Management Department
Description: Technicians who have access to a wide range of information and history, including common problems and solutions.

Source: International Technology Information Library (ITIL) and Microsoft Operations Framework (MOF) Web sites
Description: Sites that provide information for developing, troubleshooting, planning, organizing, and managing information technology (IT) services. The ITIL Web site provides an online glossary of commonly used industry terms used in IT-related documents. For more information, see the International Technology Information Library (ITIL) link and the Microsoft Operations Framework (MOF) link on the Web Resources page at http://www.microsoft.com/windows/reskits/webresources.

Source: Internet newsgroups
Description: Technical newsgroups such as those available at news://msnews.microsoft.com offer peer support for common computer problems. You can exchange messages in an appropriate forum to request or provide solutions and workarounds. Newsgroup discussions cover a wide range of topics and provide valuable information that might help you track down the source of your problem. Viewing newsgroup messages generally requires newsreader software, such as Outlook Express. An alternative approach is to use a Web-based newsgroup reader such as the one available on the Microsoft Technical Communities Web site at http://www.microsoft.com/communities.

Source: Manufacturers’ Web sites
Description: Web sites offered by manufacturers of computers, peripherals, and applications to provide Web support for their products.

Source: Microsoft Knowledge Base
Description: An extensive list of known problems and solutions that you can search. If you are unfamiliar with searching the Microsoft Knowledge Base, see article 242450, “How to Query the Microsoft Knowledge Base Using Keywords.” To find this article and for more information about the Microsoft Knowledge Base, see the Microsoft Knowledge Base link on the Web Resources page at http://www.microsoft.com/windows/reskits/webresources.

Source: Microsoft Product Support Services
Description: A Web site that contains technical information, useful links, downloads, and answers to frequently asked questions (FAQs). To access the support options available from Product Support Services, see the Microsoft Product Support Services link on the Web Resources page at http://www.microsoft.com/windows/reskits/webresources.

Source: Microsoft TechNet
Description: A subscription-based service for IT professionals that enables you to search technical content and topics about Microsoft products. For more information about TechNet, see the Microsoft TechNet link on the Web Resources page at http://www.microsoft.com/windows/reskits/webresources.

Source: Other online information Web sites
Description: Many Web sites maintained by individuals and organisations provide troubleshooting information for Microsoft Windows 98, Microsoft Windows Me, Microsoft Windows NT version 4.0, Microsoft Windows 2000, Microsoft Windows Server™ 2003, and Windows XP Professional. Some of these Web sites specialize in hardware issues; others, in software.

Source: Readme files
Description: Files that contain the latest information about the software or driver installation media. Typical file names are “Readme.txt” or “Readme1st.txt.”

Source: Reference books
Description: Reference books such as the Microsoft Windows 2000 Server Resource Kit, the Microsoft Windows Server 2003 Deployment Kit, and the Microsoft Windows Server 2003 Resource Kit provide helpful information for diagnosing problems.

Source: Technical support
Description: Technical support can help you solve a complex problem that might otherwise require substantial research time.

Source: Training
Description: Instructor-led or self-paced training can increase your troubleshooting efficiency.

Source: Windows Update Web site
Description: A site that contains downloadable content, including current information about improving system compatibility and stability. For more information about Windows Update, see the Windows Update link on the Web Resources page at http://www.microsoft.com/windows/reskits/webresources.

Before you apply a solution or workaround, or test an upgraded or updated application, use Backup
to back up your system. Backups allow you to restore the computer to the previous state if you are
not satisfied with the results. For information about backing up your system, see Chapter 14, “Backing
Up and Restoring Data.”

If your organisation has test labs to use, consider testing workarounds and updates in a lab
environment before applying them to multiple systems. For more information about software
compatibility testing, see “Avoid Common Pitfalls” later in this chapter.

Review Your System’s History

Review the history of your computer to know about recent changes, including all hardware and
software installed. If baseline or change records exist, look for information about new devices, new
applications, updated drivers, and change dates—as well as descriptions of the work done. If records
are not available, you can learn much about your computer by querying users and internal support
personnel or by using tools such as Device Manager and System Information.

Here are a few points to consider when reviewing the history of your computer:

• Did problems occur shortly after the installation of a particular application?
• Was a software update applied?

Microsoft technical support might provide a software update for an urgent or critical issue.
Software updates address a specific issue and might not be fully tested for compatibility. For
example, a software update that works for one computer might cause unwanted results in
another. Carefully read and follow the instructions before applying a software update.

• Did the problem occur soon after new hardware was installed?
• Why were hardware or software updates made?

Are the motherboard and peripheral firmware current? Can you establish a relationship
between the problem and the recent change?

• Are any non–Plug and Play devices installed?

If so, you can check for proper configuration by using hardware diagnostic programs and
Device Manager. Try replacing non–Plug and Play devices with hardware that is compatible
with Windows XP Professional.

• Was a new user recently assigned to the computer?

If so, review system history to determine whether the user has installed incompatible
hardware or software.

• When was the last virus check performed?

Does the virus scanning software incorporate the latest virus signature updates? For more
information about virus signature versions and updates, see the documentation provided with
your antivirus software.

• If a Service Pack is installed, is it the latest version?

To determine the version of a Service Pack

• In the Run dialog box, type winver

Compare System Settings and Configurations

If similar computers in your organisation are problem free when you are troubleshooting a problem,
you can use those problem-free computers as a reference for your root-cause analysis. The properly
functioning system can provide valuable baseline data. By comparing the following elements, you can
speed up the process of identifying contributing causes.

Installed services and applications

Generate a list of applications and services installed on the baseline computer to compare with
applications and services on the problem system. To gather a list of applications installed on your
system, use Add or Remove Programs in Control Panel. To gather a list of services enabled on your
system, use Services (Services.msc) or System Information.
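
Once both lists have been gathered, the comparison itself is a set difference. The sketch below assumes the application names have been typed in or exported by hand; the `compare_installed` helper and the sample inventories are hypothetical.

```python
def compare_installed(reference: set[str], problem: set[str]) -> dict[str, list[str]]:
    """Report items that are present on only one of the two systems."""
    return {
        "only_on_reference": sorted(reference - problem),
        "only_on_problem": sorted(problem - reference),
    }


# Invented inventories for a problem-free (reference) computer and a problem computer.
reference_pc = {"Office 2003", "Antivirus 8.1", "PDF Reader 7"}
problem_pc = {"Office 2003", "Antivirus 8.1", "Beta Video Driver Tools"}

differences = compare_installed(reference_pc, problem_pc)
print(differences)
```

Items that appear only on the problem system are natural first candidates to investigate, although a missing item on that system can matter just as much.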

Tip: Service Pack 2 for Windows XP adds a Show Updates check box to Add or Remove Programs that
lets you toggle between displaying or hiding installed updates such as security updates downloaded
from the Windows Update Web site.

Software revisions

Check the application and driver revisions to see whether differences exist between the two systems.
Update the problem system’s software to match the versions used on the problem-free system. For
applications, you can usually find version information by clicking Help and then clicking
About application name. For drivers, you can use Device Manager or System Information to find
version information.

System logs

Compare Event Viewer logs for problem indications such as signs of hardware stress. For example,
unexpected system shutdowns are logged with a “1076” event identification number in the System
event log. The associated descriptive text can provide essential information to diagnose the problem.
Baseline and problem systems might have similar problems, but the symptoms are more noticeable
on one computer because it performs a unique or very demanding role. For example, a server that
provides multimedia content typically consumes more system resources than a server that stores
infrequently used Microsoft Word documents. Problems with disk, audio, video, or network devices
and drivers typically appear earlier on computers that are stressed. Additionally, logging options for
most Windows XP Professional components exist, and these can help you with features such as
authentication, security, and remote access.

Hardware revisions

A minor hardware component upgrade might not be significant enough to cause a manufacturer to
change a product model number. Consider the following hypothetical scenario:

A computer company uses a revision 1.0 motherboard when assembling a Model ZZXZ1234 computer.
When reordering components, the company receives notice from the original equipment
manufacturer (OEM) that it plans to correct certain problems by substituting updated revision 1.1
motherboards. The computer company then incorporates the updated components into all Model
ZZXZ1234 computers. These minor changes might require you to exercise more care when updating
drivers or firmware in your Model ZZXZ1234 computers. For example, a support Web page for Model
ZZXZ1234 computers might post specific firmware versions, such as V3.0 for revision 1.0
motherboards and V4.0 for revision 1.1 and higher motherboards. Using firmware version V4.0 for
computers that use revision 1.0 motherboards might cause problems.
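
The rule in this hypothetical scenario can be written as a small lookup. The sketch below encodes only the two cases the example support page lists (V3.0 for revision 1.0 boards, V4.0 for revision 1.1 and higher); the function names are invented, and rejecting revisions below 1.0 is an assumption because the scenario does not mention them.

```python
def parse_revision(text: str) -> tuple[int, ...]:
    """Turn a revision string such as '1.1' into (1, 1) so it compares numerically."""
    return tuple(int(part) for part in text.split("."))


def firmware_for_board(board_revision: str) -> str:
    """Return the firmware version the hypothetical support page lists for a board."""
    revision = parse_revision(board_revision)
    if revision >= (1, 1):
        return "V4.0"
    if revision >= (1, 0):
        return "V3.0"
    # Assumption: revisions below 1.0 are not covered by the scenario.
    raise ValueError(f"No firmware listed for board revision {board_revision}")


for board in ("1.0", "1.1", "2.0"):
    print(f"Revision {board} motherboard -> firmware {firmware_for_board(board)}")
```

Checking the board revision before choosing a firmware file is what prevents the mismatch described above, where V4.0 firmware is applied to a revision 1.0 motherboard.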

Check Firmware Versions

When you turn on or cycle power to a computer, the central processing unit (CPU) begins to carry out
programming instructions, or code, contained in the motherboard system firmware. Firmware—
known as basic input/output system (BIOS) on x86-based and x64-based computers and internal
adapters—contains operating system independent code necessary for the operating system to
perform low-level functions such as start-up self-tests and the initialization of devices required to start
Windows XP Professional. If instability or setup problems affect only a few Windows XP Professional–
based computers in your organisation, check the motherboard and peripheral firmware.

Motherboard firmware revisions

Compare the BIOS version on the problem and problem-free systems. If the versions differ, check the
computer manufacturer’s Web site for the latest firmware revisions. For example, if your firmware
revision A was stable, but upgrading to firmware revision B causes problems, you might find firmware
revision C on the Web site. If no revision C exists, temporarily downgrade to revision A until an update
becomes available.
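
The decision in the revision A, B, and C example can be written out as a short function. This is a minimal Python sketch under stated assumptions: revisions are single labels that sort in release order (A before B before C), and the function and parameter names are hypothetical.

```python
def choose_firmware_action(installed, last_stable, available):
    """Suggest the next step when the installed firmware revision causes problems.

    installed   -- revision now on the problem system (for example "B")
    last_stable -- revision that was known to be stable (for example "A")
    available   -- revisions offered on the manufacturer's Web site
    Assumption: revision labels sort in release order (A < B < C).
    """
    newer = sorted(rev for rev in available if rev > installed)
    if newer:
        # A later revision exists, so try the most recent one first.
        return ("upgrade", newer[-1])
    if last_stable != installed:
        # Nothing newer is published: fall back to the stable revision for now.
        return ("downgrade", last_stable)
    return ("wait", installed)
```

For example, `choose_firmware_action("B", "A", ["A", "B"])` suggests a temporary downgrade to revision A because no revision C is listed.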

Peripheral firmware revisions

It might be necessary to check peripheral firmware revisions and upgrade firmware for individual
peripherals, such as Small Computer System Interface (SCSI) adapters, CD and DVD-ROM drives, hard
disks, video cards, and audio devices. Peripheral firmware contains device-specific instructions, but it
is independent from the operating system. Peripheral firmware enables a device to perform specific
functions. Upgrading firmware can enhance performance, add new features, or correct compatibility
problems. In most cases, you can upgrade device firmware by using software the manufacturer
provides. Outdated motherboard system firmware can cause problems, especially for Advanced
Configuration and Power Interface (ACPI) systems.

OEMs periodically incorporate updated firmware into existing products to address customers’ issues
or to add new features. Sometimes similar computers using the same hardware components have
different motherboard and peripheral firmware versions. Upgrading firmware on older devices might
require you to replace components (such as electronic chips) or exchange the part for a newer version.
To avoid firmware problems, be sure to check the firmware revision your computer uses.

To check the firmware version on your computer

1. In the Run dialog box, in the Open box, type:

msinfo32

2. In the Item column, locate BIOS Version.

Compare the firmware version listed for your system against the most recent revision available on the
OEM’s Web site. Figure 27-2 shows an example of this.

Figure 27-2 Motherboard firmware revision in System Summary

Note Windows NT 4.0 Windows Diagnostics (Winmsd.exe) is not available in Windows XP Professional.
Typing winmsd from the command prompt now starts System Information, which contains similar
information.

To check whether your firmware is ACPI compliant

• In the Run dialog box, in the Open box, type:

devmgmt.msc

Figure 27-3 shows a Device Manager display for a computer that is not using ACPI features.

Figure 27-3 Non-ACPI computer in Device Manager

As shown in Figure 27-4, if the text ACPI appears under Computer, Windows XP Professional is using
ACPI functionality.

Figure 27-4 ACPI-compliant computer in Device Manager

Warning Failure to follow the manufacturer’s instructions for updating firmware might cause
permanent damage to your computer. If you are unfamiliar with this process, request assistance from
trained personnel. Back up important data before you attempt to upgrade your firmware in case you
are unable to start your computer.

During installation, Windows XP Professional checks system firmware to determine whether your
computer is ACPI compliant. This prevents system instability, which can manifest in symptoms from
hardware problems to data loss. If your system firmware does not pass all tests, it means that the ACPI
hardware abstraction layer (HAL) is not installed. If you are certain that your computer is equipped
with ACPI-compliant system firmware, but Windows XP Professional does not use ACPI features (the
computer is listed as a non-ACPI Standard PC, for example), contact the computer manufacturer for
updated motherboard firmware. After upgrading from non-ACPI to ACPI firmware, you must reinstall
Windows XP Professional to take advantage of ACPI features.

Caution If you attempt to override the default ACPI settings selected by Windows XP Professional,
setup problems occur. The remedy is reinstallation of the operating system.

Troubleshooting Strategies

After you observe symptoms, check technical information sources, and review your system’s history,
you might be ready to test a possible solution based on the information that you have gathered. If you
are unable to locate information that applies to your problem or find more than one solution that
applies, try to further isolate your problem by grouping observations into different categories such as
software-related symptoms (as a result of a service or application), hardware-related symptoms (by
hardware types), and error messages. Prioritize your list by frequency of occurrence, and eliminate
symptoms that you can attribute to user error. This enables you to methodically plan the diagnostic
steps to take or to select the next solution to try.
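
Grouping and prioritising observations in this way can be sketched in a few lines of Python. The observation data below is invented for illustration, and the category names are only one possible grouping.

```python
from collections import Counter

# Hypothetical observations: (category, symptom, attributed_to_user_error)
observations = [
    ("hardware", "display flickers", False),
    ("software", "application stops responding", False),
    ("software", "application stops responding", False),
    ("error message", "Stop error on start-up", False),
    ("software", "file saved to wrong folder", True),
    ("software", "application stops responding", False),
    ("hardware", "display flickers", False),
]


def prioritise(observations):
    """Count symptoms by (category, symptom), skipping user error, most frequent first."""
    counts = Counter(
        (category, symptom)
        for category, symptom, user_error in observations
        if not user_error
    )
    return counts.most_common()


ranked = prioritise(observations)
for (category, symptom), count in ranked:
    print(f"{count} x [{category}] {symptom}")
```

The most frequent symptom appears first in the list, which gives you a starting point for planning diagnostic steps.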

Isolate and Resolve Hardware Problems

When troubleshooting hardware, work toward the simplest configuration possible by disabling or
removing devices. Then change the configuration one device at a time, adding or removing complexity,
until you isolate the problem device. Safe mode is useful for this kind of diagnosis because
Windows XP Professional starts with only essential drivers.

Check your hardware

If your diagnostic efforts point to a hardware problem, you can run diagnostic software available from
the manufacturer. These programs run self-tests that confirm whether a piece of hardware has
malfunctioned or failed and needs replacing. You can also install the device on different computers to
verify that the problem is not because of system-specific configuration issues. Replacing defective
hardware and diagnosing problems on a spare or test computer minimizes the impact on the user as
a result of the system being unavailable. If diagnostic software shows that the hardware is working,
consider upgrading or rolling back device drivers.

Reverse driver changes

If a hardware problem causes a Stop error that prevents Windows XP Professional from starting in
normal mode, you can use the Last Known Good Configuration start-up option. The Last Known Good
Configuration enables you to recover from problems by reverting driver and registry settings to those
used during the last user session. If you are able to start Windows XP Professional in normal mode
after using the Last Known Good Configuration, disable the problem driver or device. Restart the
computer to verify that the Stop message does not recur. If the problem persists, repeat this
procedure until you isolate the hardware that is causing the problem.

Another method to recover from problems that occur after updating a device driver is to use Device
Driver Roll Back in safe or normal mode. If you updated a driver since installing Windows XP
Professional, you can roll back the driver to determine whether the older driver restores stability. If
another driver is not available, disable the device by using Device Manager until you are able to locate
an updated driver.

Using Device Manager to disable devices is always preferable to physically removing a part because
using Device Manager does not risk damage to internal components. If you cannot disable a device by
using Device Manager, uninstall the device driver, turn off the system, remove the part, and restart
the computer. If this improves system stability, the part might be causing or contributing to the
problem and you need to reconfigure it.

Isolate and Resolve Software Issues

If you suspect that a software problem or a recent change to system settings is preventing applications
or services from functioning properly, use safe mode to help diagnose the problem. You can also use
the Last Known Good Configuration start-up option or System Restore to undo changes made by a recently installed
application, driver, or service. You can isolate issues by using the following methods.

Closing applications and processes

Close applications one at a time, and then observe the results. A problem might occur only when a
specific application is running. You can use Task Manager to end applications that have stopped
responding. For more information about ending applications and processes using Task Manager, see
Windows XP Professional Help and Support Center.

Temporarily disabling services

By using the Services snap-in (Services.msc) or the System Configuration Utility (Msconfig.exe), you
can stop and start most system services. For some services, you might need to restart the computer
for changes to take effect.

To isolate a service-related problem, you can choose to do the following:

• Disable services one at a time until the problem disappears.

You can then enable all other services to verify that you found the cause of the problem.

• Disable all non–safe mode services and then re-enable them one at a time until the problem
appears.

Use the System Configuration Utility and boot logging to determine the services and drivers
initialized in normal and safe mode. You can then disable all non–safe mode drivers and re-enable
them one at a time until the problem returns.
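
The two isolation methods above follow the same loop: change one service, then check whether the problem is present. The Python sketch below models that loop only. It does not stop or start real services; `problem_occurs` is a hypothetical callback that stands in for your own manual check after each change.

```python
def isolate_by_reenabling(services, problem_occurs):
    """Re-enable services one at a time until the problem appears.

    services       -- names of the non-safe-mode services, all disabled at the start
    problem_occurs -- callable taking the set of enabled services and returning
                      True if the problem shows up (in practice, a manual check)
    Returns the service whose re-enabling brought the problem back, or None.
    """
    enabled = set()
    for service in services:
        enabled.add(service)
        if problem_occurs(enabled):
            return service
    return None


def isolate_by_disabling(services, problem_occurs):
    """Disable services one at a time until the problem disappears, then verify."""
    enabled = set(services)
    if not problem_occurs(enabled):
        return None  # the problem is not present, so there is nothing to isolate
    for service in services:
        enabled.discard(service)
        if not problem_occurs(enabled):
            # Verification step: enable all other services and check again.
            others = set(services) - {service}
            return service if not problem_occurs(others) else None
    return None
```

A result of `None` from the verification step suggests that more than one service is involved, so further testing is needed.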

Avoid Common Pitfalls

You can complicate a problem or troubleshooting process unnecessarily by acting too quickly. Avoid
the following common pitfalls that can hinder your efforts:

• Not adequately identifying the problem before taking action
• Not observing the effects of diagnostic changes
• Not documenting changes while troubleshooting
• Not restoring previous settings
• Troubleshooting several problems at one time
• Using incompatible or untested hardware
• Using incompatible software

Not Identifying the Problem Adequately

If you fail to make essential observations before responding, you can miss important information in
the critical moments when symptoms first appear. Here are some typical scenarios.

Failing to record information before acting

An error occurs and you start your research without recording important information such as the
complete error message text and the applications running. During your research, you check technical
information resources but find that you are unable to narrow the scope of your search because of
insufficient information.

Restarting the computer too soon

In response to frequent random errors users experience with a certain application, you restart the
affected computers without observing and recording the symptoms. Although users can resume work
for the day, a call to technical support later that day is less effective because you cannot reproduce
the problem. You must wait for the problem to recur before you can gather critical information needed
to determine the root cause. For example, symptoms can be caused by power surges, faulty power
supplies, excessive dust, or inadequate ventilation. Restarting the computer might be a temporary
solution that does not prevent recurrence.

Failing to check for scheduled maintenance events or known service outages

A user comes to work early and finds that network resources or applications are not responding. You
spend time troubleshooting the problem without success only to discover that both you and the user
failed to read e-mail announcing that scheduled maintenance would cause temporary early morning
outages.

Assuming that past solutions always work

Prior experience can shorten the time to solve a recurring problem because you already know the
remedy. However, the same solution might not always solve a problem that looks familiar. Always
verify the symptoms before acting. If your initial assumptions are incorrect and you misdiagnose the
problem, your actions might make the situation worse. Keep an open mind when troubleshooting.
When in doubt, verify your information by searching technical information sources (including technical
support) and obtain advice from experienced colleagues. Do not ignore new information, and question
past procedures that seem inappropriate.

Neglecting to check the basics

A user cannot print to a new local inkjet printer. You verify cable and power connections, check the
ink cartridge, and run the printer’s built-in diagnostics, but you find nothing wrong. Windows XP
Professional cannot detect the printer, so you manually install the most recent drivers without
success. Reinstalling Windows XP Professional does not solve the problem, and you later realize that
you neglected to find out whether printing to any local printer from this computer has ever been
successful. You find that the user has never tried this, and a firmware check reveals that the parallel
port is disabled. Enabling the parallel port resolves all printing problems.

Not Observing the Effects of Diagnostic Changes

System setting changes do not always take effect immediately. For example, when troubleshooting
replication issues, you must wait to observe changes. If you do not allow adequate time to pass, you
might prematurely conclude that the change was not effective. To avoid this situation, familiarize
yourself with the feature that you are troubleshooting and thoroughly read the information provided
by technical support before judging the effectiveness of a workaround or update.

Not Documenting Changes while Troubleshooting

Documenting the steps that you take while troubleshooting allows you to review your actions after
you have resolved the problem. This is useful for very complex problems that require lengthy
procedures to resolve. Documenting your steps allows you to verify that you are not duplicating or
skipping steps, and it enables others to assist you with the problem. It also allows you to identify the
exact steps to take if the problem recurs and enables you to evaluate the effectiveness of your efforts.

Not Restoring Previous Settings

If disabling a feature or changing a setting does not produce the results you want, restore the feature
or setting before trying something else. For example, record firmware settings before changing them
to diagnose problems. Not restoring settings can make it difficult to determine which of your actions
resolved the problem. When verifying solutions that require you to make extensive changes or restart
the computer multiple times, perform backups before troubleshooting so that you can restore the
system if your actions are ineffective or cause start-up problems.

Review backup procedures

Backups are essential for all computers, from personal systems to high-availability servers. If you
suspect that your troubleshooting efforts might worsen the problem or risk important data, perform
a backup. This enables you to restore your system if you experience data loss, Stop errors, or other
start-up problems. Backups allow you to partially or completely restore the system and continue
where you left off. When you evaluate or create backup procedures, consider the following:

• Use the verification option of your backup software to check that your data is correctly written
to backup media.
• Routinely check the age and condition of backup media, and follow the manufacturer’s
recommendations for using backup media.
• Follow the hardware manufacturer’s recommendations for maintaining the backup device.

Windows XP Professional also provides other ways to restore system settings such as System Restore
and the Last Known Good Configuration start-up option.

Troubleshooting Several Problems at One Time

If multiple problems affect your system, avoid troubleshooting them as a group. Instead, identify
shared symptoms, and then isolate and treat each separately. For example, faulty video memory can
cause Stop messages, corrupted screen images, and system instability. While diagnosing the
symptoms, you might find that errors occur only with multimedia applications that use advanced
three-dimensional rendering. When you attempt to rule out the possibility of failed video hardware
by replacing the VGA adapter, you might find that this action also resolves the other issues.

Using Incompatible or Untested Hardware

For many organisations, standards for selecting hardware and purchasing new systems and
replacement parts do not exist, are not fully defined, or are simply ignored. Standards that are well
defined, refined, maintained, and followed can reduce hardware variability and optimize
troubleshooting efforts.

If you need to replace hardware, record your troubleshooting actions as thoroughly as possible. Before
installing a new device or replacement part, verify that it is in the Windows Catalog at
http://www.microsoft.com/windows/catalog, that the firmware versions for the system motherboard
and devices are current, and that any replacement part is pretested or “burned-in” before
deployment.

Checking the Windows Catalog

Hardware problems can occur if you use devices that are not compatible with Windows XP
Professional. The Windows Catalog is a Web-based searchable database of hardware and software
that have been certified under the Designed for Windows XP Logo Program. The Windows Catalog
outlines the hardware components that have been tested for use with Windows XP Professional and
is continuously updated as additional hardware is tested and approved.

Tip While the Windows Catalog replaces the Hardware Compatibility List (HCL) used for previous
Microsoft Windows platforms, you can still access text-only versions of the HCL for different Windows
versions from Windows Hardware and Driver Central at

http://www.microsoft.com/sql/prodinfo/previousversions/winxpsp2faq.mspx.

If several variations of a device are available from one manufacturer, it is best to select only models
listed in the Windows Catalog.

Table 27-2 explains the differences between Windows XP logo designations. For more information,
see “Logo Program Options for Application Software” at

http://www.microsoft.com/winlogo/software/SWprograms.mspx.

Table 27-2 Designed and Compatible Designations in the Windows Catalog

HCL Designation               Description

Designed for Windows XP       Indicates that this product has been specifically designed to take
                              advantage of the new features in Windows XP.

Compatible with Windows XP    Indicates that this product has not met all the Designed for Windows XP
                              Logo Program requirements but has nevertheless been deemed compatible
                              with Windows XP by Microsoft or the manufacturer.

When you upgrade to Windows XP Professional, device hardware resource settings are not migrated.
Instead, all devices are redetected and enumerated during installation. Typically, upgrades to
Windows XP Professional follow this migration path:

• An upgrade to Windows XP Professional from Windows 98, Microsoft Windows Millennium Edition (Me),
Microsoft Windows NT 4.0 Workstation, or Windows 2000 Professional.

After installation, you might find that devices that functioned before the upgrade behave differently
or do not work. This might occur for the following reasons:

• A driver for the device is not on the Windows XP Professional operating system CD, and Device
Manager lists it as unknown or conflicting hardware.
• Windows XP Professional Setup installed a generic driver that might be compatible with your
device, but it does not fully support enhanced features. Many hardware manufacturers also
provide tools that add value to their products, but they are not available in Windows XP
Professional. Windows XP Professional Setup installs the basic feature set needed to enable
your product to function. For additional software that enhances functionality or adds
additional features, download the latest Windows XP Professional compatible drivers and
tools from the manufacturer’s Web site.

Do not attempt to re-install older drivers because doing so might cause system instability, Stop errors,
or other start-up problems. For more information about troubleshooting
Stop errors and start-up problems.

For best results, always use Designed for Windows XP certified devices. It is especially important to
refer to the Windows Catalog before purchasing modems, tape backup units, and SCSI adapters. If you
must use non-certified hardware, check the manufacturer’s Web site for the latest device
driver.

Note If your system has noncertified hardware installed, uninstall drivers for these devices before
installing Windows XP Professional. If you cannot complete setup, remove the hardware from your
system temporarily and rerun Setup.

Testing new and replacement parts

If you must replace or upgrade older parts with newer ones, first purchase a small number of new
parts and conduct performance, compatibility, and configuration tests before doing a general
deployment. The evaluation is especially important when a large number of systems are involved, and
it might lead you to consider similar products from other manufacturers.

When replacing devices, use pretested or burned-in parts whenever possible. A burn-in involves
installing an electronic component and observing it for several days for signs of abnormal behaviour.
Typically, computer components fail early or not at all, and a burn-in period reveals manufacturing
defects that lead to premature failure. You can choose to do additional testing by simulating worst-
case conditions. For example, you might test a new hard disk by manually copying files or creating a
batch file that repeatedly copies files, filling the disk to nearly full capacity.
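
The file-copy test mentioned above could be scripted rather than done by hand. The following Python sketch is one possible approach, not a manufacturer-supplied test: the function name, file naming pattern, and `max_bytes` limit are assumptions, and each copy is compared with the source so that silent corruption is reported.

```python
import filecmp
import shutil
from pathlib import Path


def fill_by_copying(source, target_dir, max_bytes):
    """Copy `source` into `target_dir` repeatedly without exceeding `max_bytes` in total.

    Each copy is compared byte for byte with the source file.
    Returns the number of verified copies written.
    """
    source = Path(source)
    target_dir = Path(target_dir)
    size = source.stat().st_size
    if size == 0:
        raise ValueError("source file must not be empty")
    copies = 0
    while (copies + 1) * size <= max_bytes:
        destination = target_dir / f"burnin_{copies:06d}{source.suffix}"
        shutil.copyfile(source, destination)
        # shallow=False forces a full content comparison, not just a size check.
        if not filecmp.cmp(source, destination, shallow=False):
            raise OSError(f"verification failed for {destination}")
        copies += 1
    return copies
```

To fill a disk to nearly full capacity, you might set `max_bytes` to a large fraction of `shutil.disk_usage(target_dir).free`. Run a test like this only on a disk that holds no important data.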

Using Incompatible Software

Before installing software on multiple computers, test it for compatibility with existing applications in
a realistic test environment. Observe how the software interacts with other programs and drivers in
memory. If only the test application and the operating system are active, testing does not provide a
realistic or valid indication of compatibility or performance. Testing is necessary even if a
manufacturer guarantees full Windows XP Professional compatibility, because older programs might
affect new software in unpredictable ways.

For large organisations, consider limited pre-deployment test rollouts to beta users who can provide
real-world feedback. Select testers who have above-average computer skills to get technically
accurate descriptions of problems they observe.

Setup and stability criteria are equally important in evaluating software and hardware for purchase.
Testing is critical for upgrading systems from earlier versions of Windows such as Windows 98 or
Windows NT 4.0. Software and drivers that were installable and stable on earlier versions of Windows
might exhibit problems or not function in the Windows XP Professional environment. Video, sound,
and related multimedia drivers and tools (such as audio, CD-ROM mastering, and DVD playback
software) are especially sensitive to operating system upgrades.

Document and Evaluate the Results

You can increase the value of information collected during troubleshooting by keeping accurate and
thorough records of all work done. You can use your records to reduce redundant effort and to avoid
future problems by taking preventive action.

Create a configuration management database to record the history of changes, such as installed
software and hardware, updated drivers, replaced hardware, and altered system settings. Periodically
verify, update, and back up this data to prevent permanent loss. To maximize use of your database,
note details such as:

• Changes made
• Times and dates of changes
• Reasons for the changes
• Users who made the changes
• Positive and negative effects the changes had on system stability or performance
• Information provided by technical support

When planning this database, keep in mind the need to balance scope and detail when deciding which
items or attributes to track. For more information, see the Information Technology Infrastructure
Library (ITIL) and Microsoft Operations Framework (MOF) Web site links provided in “Check Technical
Information Resources” earlier in this chapter.
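
As a starting point, a change history like the one described above can be kept in a small database. The Python sketch below uses the standard `sqlite3` module. The table layout, column names, and function names are assumptions chosen to match the details listed earlier, not a prescribed ITIL or MOF schema.

```python
import sqlite3

# Hypothetical schema: one row per change, covering the details listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS changes (
    id            INTEGER PRIMARY KEY,
    computer      TEXT NOT NULL,
    changed_at    TEXT NOT NULL,   -- ISO 8601 date and time
    changed_by    TEXT NOT NULL,
    change_made   TEXT NOT NULL,
    reason        TEXT NOT NULL,
    effect        TEXT,            -- positive or negative effects observed
    support_notes TEXT             -- information provided by technical support
)
"""


def open_database(path=":memory:"):
    """Open (or create) the change database; the default keeps it in memory."""
    connection = sqlite3.connect(path)
    connection.execute(SCHEMA)
    return connection


def record_change(connection, computer, changed_at, changed_by,
                  change_made, reason, effect=None, support_notes=None):
    """Add one change record."""
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "INSERT INTO changes (computer, changed_at, changed_by, change_made,"
            " reason, effect, support_notes) VALUES (?, ?, ?, ?, ?, ?, ?)",
            (computer, changed_at, changed_by, change_made, reason, effect, support_notes),
        )


def history(connection, computer):
    """Return (changed_at, change_made) pairs for one computer, oldest first."""
    rows = connection.execute(
        "SELECT changed_at, change_made FROM changes"
        " WHERE computer = ? ORDER BY changed_at",
        (computer,),
    )
    return rows.fetchall()
```

Pass a file path to `open_database` to keep the records between sessions, and include that file in your regular backups.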

Update baseline information after installing new hardware or software to compare past and current
behaviour or performance levels. If previous baseline information is not available, use System
Information, Device Manager, the Performance tool, or industry standard benchmarks to generate
data.
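
Comparing current readings against a baseline can be as simple as flagging values that have moved by more than an agreed margin. The Python sketch below assumes that you have already exported readings into dictionaries. The metric names and the 10 percent tolerance are illustrative assumptions.

```python
def compare_to_baseline(baseline, current, tolerance=0.10):
    """Return metrics whose current value differs from the baseline by more than `tolerance`.

    baseline, current -- dicts mapping a metric name to a numeric reading
    tolerance         -- allowed relative change (0.10 means 10 percent)
    Metrics that are missing from `current`, or whose baseline is zero, are skipped.
    """
    deviations = {}
    for metric, expected in baseline.items():
        if metric not in current or expected == 0:
            continue
        change = (current[metric] - expected) / expected
        if abs(change) > tolerance:
            deviations[metric] = round(change, 3)
    return deviations


# Hypothetical readings taken before and after a hardware or software change
baseline = {"cpu_percent": 20.0, "free_memory_mb": 512.0, "disk_queue": 1.0}
current = {"cpu_percent": 21.0, "free_memory_mb": 256.0, "disk_queue": 3.0}
result = compare_to_baseline(baseline, current)
```

Here `result` flags free memory (down by half) and disk queue length (tripled), while the small change in CPU use stays within tolerance.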

Baselines combined with records kept over time enable you to organize experience gained, evaluate
maintenance efforts, and judge troubleshooting effectiveness. Analysis of this data can form the basis
of a troubleshooting manual or lead to changes in control policy for your organisation.

A post-troubleshooting review, or post-mortem, can help you pinpoint troubleshooting areas that
need improvement. Some questions you might consider during this self-evaluation period include:

• What changes improved the situation?
• What changes made the problem worse?
• Was system performance restored to expected levels?
• What work was redundant or unnecessary?
• How effectively were technical support resources used?
• What other tools or information not used might have helped?
• What unresolved issues require further root-cause analysis?

Write an Action Plan

An action plan is a set of relevant troubleshooting objectives and strategies that fits within your
organisation’s configuration and management strategies. After you identify the problem and find a
potential solution or workaround that you have tested on one or more computers, you might need an
action plan if the solution is to be deployed across your organisation, possibly involving hundreds or
thousands of computers. Coordinate your plan with supervisors and staff members in the affected
areas to keep them informed well in advance and to verify that the schedule does not conflict with
important activity. Include provisions for troubleshooting during nonpeak work hours or dividing work
into stages over a period of several days. Evaluate your plan, and as you uncover weaknesses, update
it to increase its effectiveness and efficiency.

As the number of users grows, the potential loss of productivity as a result of disruption increases.
Your plan must account for dependencies and allow last-minute changes. Factor in contingency plans
for unforeseen circumstances.

For more information about creating a configuration management database, see the ITIL and MOF
links listed in Table 27-1.

Take Proactive Measures

You can combine information gathered while troubleshooting major and chronic problems to create
a proactive plan to prevent or minimize problems for the long term. When planning a maintenance or
upgrade process for your organisation, consider the following goals:

• Improving the computing environment
• Monitoring system and application logs
• Documenting hardware and software changes
• Anticipating hardware and software updates

Improve the Computing Environment

External factors can have a major impact on the operation and lifespan of a computer. Some basic
precautions include labelling connecting cables, periodically testing uninterruptible power supply
(UPS) batteries, and placing computers far from high-traffic areas where they might be bumped or
damaged. It is important to check environmental factors such as room temperature, humidity, and air
circulation to prevent failures from excessive heat. Dust can clog cooling equipment such as computer
fans and cause them to fail. Install surge suppressors, dedicated power sources, and backup power
devices to protect equipment from electrical current fluctuations, surges, and spikes that can cause
data loss or damage equipment. Other precautions include:

• Performing regular file and system state backups to prevent data loss. For more information
about Windows Backup, see Windows XP Professional Help and Support Center and Chapter
14, “Backing Up and Restoring Data.”
• Using Windows XP Professional–compatible virus-scanning software and regularly
downloading the latest virus signature updates. A virus signature data file contains
information that enables virus-scanning software to identify infected files.

Monitor System and Application Logs

Monitor your system to detect problems early and avoid having software or hardware failure be your
first or only warning of a problem. When using a monitoring tool such as Performance (Perfmon.msc)
to evaluate changes that might affect performance, compare baseline information to current
performance. The resulting data helps you isolate bottlenecks and determine whether actions such as
upgrading hardware, updating applications, and installing new drivers are effective. You can also use
the data to justify expenditures, such as additional CPUs, more RAM, and increased storage space.
Checking the Event Viewer regularly helps you to identify chronic problems and detect potential
failures. This allows you to take corrective action before a problem worsens. For more information
about monitoring your system, see “Overview of Performance Monitoring” in the Operations Guide of
the Microsoft Windows 2000 Server Resource Kit.

Document Changes to Hardware and Software

In addition to recording computer-specific changes, do not neglect to record other factors that directly
affect computer operation, such as Group Policy and network infrastructure changes. For more
information about developing and implementing a standard process for recording configuration
changes, see “Document and Evaluate the Results” earlier in this chapter.

Plan for Hardware and Software Upgrades

Regardless of how advanced your system hardware or software is at the time of purchase, computer
technologies have a limited lifespan. Your maintenance plan must account for the following factors
that can make updates and upgrades necessary.

Increased demand for computing resources

When computing needs grow beyond the capability of your hardware, it makes sense to upgrade
hardware components or entire systems. Performance degradation might be the result of system
bottlenecks caused by hardware that has reached maximum capacity. Optimizing drivers and updating
applications can help in the short term, but user demand for computing resources eventually makes
it necessary to upgrade to more powerful hardware.

Discontinued support for a device or software

Operating system or manufacturer support for a device or software might be discontinued, causing
compatibility issues that can block upgrades to new operating systems or prevent full use of certain
features in Windows XP Professional. To minimize effort when upgrading hardware and software for
many computers, purchase similar computers and follow replacement standards for your
organisation. Failure to standardize applications and hardware can make upgrading more difficult and
expensive, especially if technicians and users need retraining.

Added capabilities

Having a process for upgrading operating systems or installing application patches, software updates,
and operating system Service Packs helps to maintain the stability, performance, and reliability of your
equipment. Schedule time to stay current with new developments and product updates.

Establishing a Troubleshooting Checklist

A guaranteed “system” for troubleshooting all computer-related problems does not exist. Effective
troubleshooting requires technical research and experience, careful observation, resourceful use of
information, and patience. During the troubleshooting process, you can consult the checklist in Table
27-3.

Table 27-3 Troubleshooting Checklist

Task: Identify problem symptoms.
Action: Observe the symptoms:

 Under what conditions does the problem occur?
 Which aspects of the operating system control these conditions?
 What applications or subsystems does the problem seem related to?
 Record all error information for future reference, including the exact message text and
error numbers.

Do not forget to check the basics:

 Verify that the power cables are properly connected and are not damaged or worn.
 Check firmware settings to verify that devices are enabled.

Task: Check technical information resources.
Action: Research the problem:

 What actions were tried for this or similar problems in the past?
 Is this a known issue for which a solution or workaround exists? What were the results?
 What information is available from product documentation, internal support sources, or
outside resources, such as a manufacturer’s Web site or newsgroups?
 What information can you obtain from support staff, such as Help Desk, or other users who
might have experienced similar problems?

Task: Review your system’s history.
Action: Analyze the events that led up to the problem:

 What happened just before the problem occurred?
 What hardware was recently installed? Are driver and firmware revisions current?
 What software or system file updates were made? Are the software revisions current?
 Does the software and hardware configuration match the documented configuration? If not,
try to determine the differences.
 Did you examine the event logs for clues to the problem?

Gather baseline information or compare to a reference system:

 Did this application or hardware work correctly in the past? What has changed since then?
 Does the application or hardware work correctly on another computer? If so, what is
different on that computer?
 Generate performance data by using the Performance tool or benchmark programs. If previous
baselines exist, compare current and past performance.

Task: Document and evaluate the results.
Action: Record the results:

 Use a common report format such as a database to record information.
 Make a detailed record of all the work done to correct the problem for future reference.
 Record who, what, when, and why, and identify positive and negative cause and effect.

Evaluate the results:

 Was the work done efficiently?
 Was the solution effective? What remains unresolved?
 When a solution was implemented, was system performance restored to expected levels?
 What processes can be changed or implemented to prevent the problem from recurring?
 Are systems being adequately monitored? Can this problem be caught early if it happens
again?
 What additional information, tools, or tests are needed?
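The checklist recommends recording results in a common format such as a database. The following is a minimal sketch of such a record, using Python's built-in sqlite3 module. The table name, column names, and sample entry are illustrative assumptions, not part of the checklist itself.

```python
import sqlite3
from datetime import datetime, timezone

def open_log(path=":memory:"):
    """Create (if needed) and return a connection to a fault log database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS fault_log (
               id INTEGER PRIMARY KEY,
               logged_at TEXT NOT NULL,    -- when
               technician TEXT NOT NULL,   -- who
               symptom TEXT NOT NULL,      -- what
               cause TEXT,                 -- why
               action TEXT NOT NULL,       -- work done
               resolved INTEGER NOT NULL   -- 1 = fixed, 0 = still open
           )"""
    )
    return conn

def record_fault(conn, technician, symptom, cause, action, resolved):
    """Store one troubleshooting record with a UTC timestamp."""
    conn.execute(
        "INSERT INTO fault_log (logged_at, technician, symptom, cause, action, resolved) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), technician, symptom, cause, action, int(resolved)),
    )
    conn.commit()

def find_similar(conn, keyword):
    """Look up earlier faults whose symptom mentions the keyword."""
    rows = conn.execute(
        "SELECT symptom, cause, action FROM fault_log WHERE symptom LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return rows.fetchall()

conn = open_log()
record_fault(conn, "J. Smith", "Printer offline after move",
             "Network cable unplugged", "Reconnected cable", True)
print(find_similar(conn, "printer"))
```

Searching past records in this way supports the checklist question about whether a similar problem was solved before.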

Activity 1

What should be your first step in troubleshooting a hardware problem? Why should this be the
first step?


Now that you are so well equipped, you are ready to embark on the systematic journey. There are two
major stages in the troubleshooting process. The first stage is identifying the issue. The second stage
is performing the actual repair (or taking other steps that identifying the issue has made clear).

To identify the issue, you must:

1. Gather information.
2. Verify the issue.
3. Try quick fixes.
4. Use appropriate diagnostics.
5. Perform a split-half search.
6. Use additional resources to research the issue.
7. Escalate the issue (if necessary).

After you have identified the issue, you must:

1. Repair or replace the faulty item.
2. Verify the repair by testing the product thoroughly.
3. Inform the user of what you have done.
4. Complete administrative tasks (yes, really).

We will review these processes in detail in the following sections of this lesson. First, we'll give you
the overview in chart form.

General Troubleshooting Flowchart

Keeping the steps of the troubleshooting process straight is sometimes difficult for new technicians.
Apple has produced a General Troubleshooting Flowchart that you can use as a reference.

Analyse and document the system that requires troubleshooting [3]
Troubleshooting
Troubleshooting is perhaps the most difficult task that computer professionals face. Added to the
need to get to the bottom of a problem afflicting the network is the pressure to do so as quickly as
possible. Computers never seem to fail at a convenient time. Failures occur in the middle of a job or
when there are deadlines, and pressures to fix the problem immediately are intense.

After a problem has been diagnosed, locating resources and following the procedures required to
correct the problem are straightforward. But before that diagnosis occurs, it is essential to isolate the
true cause of the problem from irrelevant factors.

Troubleshooting is more of an art form than an exact science. However, to be efficient and effective
as a troubleshooter, you must approach the problem in an organized and methodical manner.
Remember that you are looking for the cause, not its symptoms; yet frequently, problems as originally
reported are just symptoms and not the true cause. As a troubleshooter you need to learn to quickly
and confidently eliminate as many alternative causes as possible. This will allow you to focus on the
things that might be the cause of the problem. To do this, you must take a systematic approach.

The process of troubleshooting a computer network problem can be divided into five steps.

Step 1: Defining the Problem

The first phase is the most critical, yet most often ignored. Without a complete understanding of the
entire problem, you can spend a great deal of time working on the symptoms, without getting to the
cause. The only tools required for this phase are a pad of paper, a pen (or pencil), and good listening
skills.

[3] Source: KSI, http://pluto.ksi.edu/~cyh/cis370/ebook/ch13b.htm, as at 9 December 2016.

Listening to the client or network user is your best source of information. Remember that while you
might know how the network functions and be able to find the technical cause of the failure, those
operating the network on a daily basis were there before and after the problem started and probably
recall the events that led up to the failure. By drawing on their experience with the problem, you can
get a head start on narrowing down the possible causes. To help identify the problem, list the
sequence of events, as they occurred, before the failure. You might want to create a form with these
questions (and others specific to the situation) to help organize your notes.

Some general questions to ask might include:

 When did you first notice the problem or error?
 Has the computer recently been moved?
 Have there been any recent software or hardware changes?
 Has anything happened to the workstation? Was it dropped or was something dropped on it?
Were coffee or soda spilled on the keyboard?
 When exactly does the problem or error occur? During the start-up process? After lunch? Only
on Monday mornings? After sending an e-mail message?
 Can you reproduce the problem or error? If so, how do you reproduce the problem?
 What does the problem or error look like?
 Describe any changes in the computer (such as noises, screen changes, disk activity lights).

Users, even those with little or no technical background, can be helpful in collecting information if
they are questioned effectively. Ask users what the network is doing or not doing that makes them
think it's not functioning correctly. User observations that can be clues to the underlying cause of a
network problem include the following:

 "The network is really slow."
 "I cannot connect to the server."
 "I was connected to the server but I lost the connection."
 "One of my applications will not run."
 "I cannot print."

As you continue to ask questions, you can begin to narrow your focus, as the following list illustrates:

 Are all users affected or only one?
If only one user has a problem, the user's workstation is probably the cause.

 Are the symptoms constant or intermittent?
Intermittent symptoms are often a sign of failing hardware.

 Did the problem exist before an operating system upgrade?
Any change in operating system software can cause new problems.

 Does the problem appear with all applications or only one?
If only one application causes problems, focus on the application.

 Is this problem similar to a previous problem?
If a similar problem occurred in the past, there might be a documented solution.

 Are there new users on the network?
Increased traffic can cause logon and processing delays.

 Is there new equipment on the network?
Check to verify that new network equipment has been correctly configured.

 Was a new application installed before the problem occurred?
Installation and training issues can cause application problems.

 Has any of the equipment been moved recently?
The moved equipment might not be connected to the network.

 Which manufacturers' products are involved?
Some vendors offer telephone, online, or onsite support.

 Is there a history of incompatibility among certain vendors and certain components such as
cards, hubs, disk drives, software, or network operating software?
There might be a documented solution on the vendor's Web site.

 Has anyone else attempted to solve this problem?
Check for documented repairs and ask co-workers about attempted repairs.

Step 2: Isolating the Cause

The next step is to isolate the problem. Begin by eliminating the most obvious problems and work
toward the more complex and obscure. Your purpose is to narrow your search down to one or two
general categories.

Be sure to observe the failure yourself. If possible, have someone demonstrate the failure to you. If it
is an operator-induced problem, it is important to observe how it is created, as well as the results.

The most difficult problems to isolate are those which are intermittent and that never seem to occur
when you are present. The only way to resolve these is to re-create the set of circumstances that cause
the failure. Sometimes, eliminating causes that are not the problem is the best you can do. This
process takes time and patience. The user also needs to keep detailed records of what is being done
before and when the failure occurs. It can help to tell the user to refrain from doing anything with the
computer when the problem recurs, except to call you. That way, the "evidence" won't be disturbed.

While the information collected provides the foundation for isolating the problem, the administrator
should also refer to documented baseline information to compare with current network behaviour.
Now it is time to put that knowledge to work. Rerun tests under the same set of conditions as prevailed
when you created the baseline, then compare the two results. Any changes between the two can
indicate the source of the problem.
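The baseline comparison described above can be sketched in a few lines of Python. This is only an illustration: the metric names, the sample values, and the 20 percent tolerance are assumptions chosen for the example, and a real baseline would use whatever measurements your organisation recorded.

```python
def compare_to_baseline(baseline, current, tolerance_pct=20.0):
    """Return the metrics whose current value differs from the baseline
    by more than tolerance_pct, with the percentage change for each."""
    deviations = {}
    for metric, base_value in baseline.items():
        if metric not in current or base_value == 0:
            continue  # cannot compare a missing metric or a zero baseline
        change_pct = (current[metric] - base_value) / base_value * 100
        if abs(change_pct) > tolerance_pct:
            deviations[metric] = round(change_pct, 1)
    return deviations

# Hypothetical measurements taken under the same test conditions.
baseline = {"ping_ms": 2.0, "throughput_mbps": 90.0, "error_frames_per_min": 5.0}
current = {"ping_ms": 2.2, "throughput_mbps": 45.0, "error_frames_per_min": 40.0}

print(compare_to_baseline(baseline, current))
# {'throughput_mbps': -50.0, 'error_frames_per_min': 700.0}
```

In this made-up case, response time is within tolerance, but throughput has halved and error frames have risen sharply, which would point the search toward cabling or a failing network device.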

Information gathering involves scanning the network and looking for an obvious cause of the problem.
A quick scan should include a review of the documented history of the network to determine if the
problem has occurred before and, if so, whether there is a recorded solution.

Step 3: Planning the Repair

After you have narrowed your search down to a few categories, the final process of elimination begins.

Create a planned approach to isolating the problem based on your knowledge at this point. Start with
the most obvious or easiest possibility to eliminate and continue toward the more difficult
and complex. It is important to record each step of the process; document every action and its results.

After you have created your plan, it is important to follow it through as designed. Jumping ahead and
randomly trying things out of order can often lead to problems. If the first plan is not successful (always
a possibility), create a new plan based on what you discovered with the previous plan. Be sure to refer
to, re-examine, and reassess any assumptions you might have made in the previous plan.

After you have located the problem, either repair the defect or replace the defective component. If
the problem is software-based, be sure to record the "before" and "after" changes.

Step 4: Confirming the Results

No repair is complete without confirmation that the job has been successfully concluded. You need to
make sure that the problem no longer exists. Ask the user to test the solution and confirm the results.
You should also make sure that the fix did not generate new problems. Be sure to confirm not only
the problem you fixed, but also that what you have done has not had a negative impact on any other
aspect of the network.

Step 5: Documenting the Outcome

Finally, document the problem and the repair. Recording what you've learned will provide you with
invaluable information. There is no substitute for experience in troubleshooting, and each new
problem presents you with an opportunity to expand that experience. Keeping a copy of the repair
procedure in your technical library can be useful when the problem (or one like it) occurs again.
Documenting the troubleshooting process is one way to build, retain, and share experience.

Remember that any changes you have made might have affected the baseline. You might need to
update the network baseline in anticipation of future problems and needs.

Segmenting the Problem


If the initial review of network statistics and symptoms does not expose an obvious problem, dividing
the network into smaller parts to isolate the cause is the next step in the troubleshooting process. The
first question to ask is whether the problem stems from the hardware, or the software. If the problem
appears to be hardware-based, start by looking at only one segment of the network, then looking at
only one type of hardware.

Check the hardware and network components including:

 NICs.
 Cabling and connectors.
 Clients/workstations.
 Connectivity components such as repeaters, bridges, routers, brouters, and gateways.
 Hubs.
 Protocols.
 Servers.
 Users.

Often, isolating or removing a portion of the network will help to get the rest of the network up and
operational again. If removing a portion solved the problem for the rest of the network, the search for
the problem can be focused on the part that was removed.
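When the components between two points form a chain, the segment-by-segment isolation described above can be shortened with a split-half search: test the middle of the chain, then repeat on whichever half contains the fault. The sketch below assumes a single fault, so that every check up to the fault succeeds and every check past it fails. The component names and the probe function are hypothetical.

```python
def split_half_search(components, path_ok_through):
    """Find the first faulty component in an ordered chain.

    path_ok_through(i) must return True if the path works up to and
    including components[i]. Assumes one fault, so all checks beyond the
    faulty component fail.
    """
    low, high = 0, len(components) - 1
    first_bad = None
    while low <= high:
        mid = (low + high) // 2
        if path_ok_through(mid):
            low = mid + 1          # fault lies further along the chain
        else:
            first_bad = mid        # fault is here or earlier
            high = mid - 1
    return None if first_bad is None else components[first_bad]

chain = ["workstation NIC", "patch cable", "wall port", "hub", "router", "server NIC"]
faulty_index = 3   # simulated fault at the hub
checks = []

def probe(i):
    """Stand-in for a real test such as a ping or a cable check."""
    checks.append(chain[i])
    return i < faulty_index

print(split_half_search(chain, probe))   # hub
print(len(checks))                       # 3
```

Three checks were enough here, where testing each of the six components in turn could have taken up to six.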

Network protocols require special attention because they are designed to bypass network problems
and attempt to overcome network faults. Most protocols use what's known as "retry logic," in which
the software attempts an automatic recovery from a problem. This becomes noticeable through slow
network performance as the network makes new and repeated attempts to perform correctly. Failing
hardware devices, such as hard drives and controllers, will use retry logic by repeatedly interrupting
the CPU for more processing time to complete their task.
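To see why retry logic shows up as slowness rather than as an outright failure, consider this small sketch. It is a simulation only: the doubling delay between attempts (exponential backoff) and the timing values are assumptions for illustration, and real protocols use their own retry rules.

```python
def send_with_retries(send_once, max_attempts=4, base_delay_ms=100):
    """Call send_once() until it succeeds or attempts run out.

    Returns (success, attempts_used, total_wait_ms). The wait is only
    added up, not actually slept, so the example runs instantly.
    """
    waited_ms = 0
    for attempt in range(1, max_attempts + 1):
        if send_once():
            return True, attempt, waited_ms
        if attempt < max_attempts:
            waited_ms += base_delay_ms * 2 ** (attempt - 1)  # 100, 200, 400 ...
    return False, max_attempts, waited_ms

# A marginal link that drops the first two transmissions.
outcomes = iter([False, False, True])
print(send_with_retries(lambda: next(outcomes)))   # (True, 3, 300)
```

The transfer succeeds, so the user sees no error, but it took three attempts and 300 ms of waiting. Multiplied across thousands of packets, this is the slow network performance the paragraph above describes.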

When you are assessing hardware performance problems, use the information obtained from the
hardware baselines to compare against the current symptoms and performance.

Isolating the Problem
After you have gathered the information, rank the list of possible causes in order, beginning with the
most likely and moving to the least likely cause of the problem. Then select the most likely candidate
from the list of possible causes, test it and see if that is the problem. Start from the most obvious and
work to the most difficult. For example, if you suspect that a faulty network interface card (NIC) in one
of the computers is the cause of the trouble, replace it with a NIC that is known to be in good working
order.

Setting Priorities
A fundamental element in network problem solving is setting priorities. Everyone wants his or her
computer fixed first, so setting priorities is not an easy job. While the simplest approach is to prioritize
on a "first come, first served" basis, this does not always work, as some failures are more critical to
resolve than others. Therefore, the initial step is to assess the problem's impact on the ability to
maintain operations. For example, a monitor that is gradually getting fuzzy over several days would
have a lower priority to address than the inability to access the payroll file server prior to a check run.
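One possible way to combine impact with "first come, first served" is a priority queue: jobs are ordered by impact first and by arrival order second. The four-level impact scale and the sample jobs below are assumptions for illustration; your organisation's service level agreements would define the real categories.

```python
import heapq
import itertools

class TicketQueue:
    """Orders support jobs by business impact, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()   # tie-breaker: first come, first served

    def add(self, description, impact):
        # impact: 1 = stops critical operations ... 4 = minor annoyance (assumed scale)
        heapq.heappush(self._heap, (impact, next(self._arrival), description))

    def next_job(self):
        """Remove and return the description of the most urgent job."""
        _impact, _order, description = heapq.heappop(self._heap)
        return description

queue = TicketQueue()
queue.add("Monitor gradually getting fuzzy", impact=4)
queue.add("Cannot reach payroll file server before pay run", impact=1)
queue.add("One application will not start", impact=3)

print(queue.next_job())   # Cannot reach payroll file server before pay run
```

The fuzzy monitor was reported first, but the payroll server fault is handled first because it has the greater impact on operations.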

Activity 2

Given the following scenario, describe how you would research, identify, prioritize, and resolve
this network problem:

The network has been running well at the site of a small manufacturer. However, a user in the
quality control division now calls to report that she is unable to get the daily status reports printed
by the printer in the department. Meanwhile, the shipping department reports that a rerouted
print job did not print in the quality control department. What is your strategy for solving this
network problem?


Gather Information

It is important to know as much as you can about the situation before you jump headlong into trying
to fix it. Gathering information is the first step in successful troubleshooting.

If the Mac is functional, run System Profiler (discussed in Lesson 2, “Software Tools”) to compile useful
technical information on the Mac and its components.

In some cases, the customer is available to explain the nature of the situation. In those cases, the
following tips will assist you in getting accurate and useful information from your customer.

Ask Probing Questions

When you question a customer, it is very important to understand that the customer is generally not
happy with his or her situation. Your courtesy and professionalism will make the circumstances better
for the customer and enable you to gather information needed to repair the customer's product. Be
patient. Be polite. Be constantly aware that you are there to help the customer.

Furthermore, be aware that a customer coming to you for help with a situation probably will not share
your level of technical expertise and may therefore use incorrect or imprecise terminology. Try to talk
to customers at their level rather than attempt to impress them with your knowledge and mastery of
all things technical.

The following are tips you should follow when gathering information from customers:

1. Start with open questions such as “What is the issue?” Open questions generally start with
words like how, why, when, who, what, and where. They cannot be answered with “yes” or
“no.”
2. Let customers explain in their own words what they have experienced. Do not interrupt the
customer—interrupting generally prompts someone to start over.
3. As you begin to understand the basics of the issue, start using closed questions that require
more limited, specific answers. “What operating system are you using?” is an example of a
closed question. The customer will either tell you what the Mac OS version is or tell you that
he or she does not know. Closed questions often can be answered with “yes” or “no.”
4. Verify your understanding of what the customer has told you. Restate what you have been
told and get the customer's agreement that you understand the issue. An example of
restatement would be, “Okay, so what's happening is that when you try X, Y happens. Is that
correct?”
5. If the customer agrees that you understand, continue to gather information. If the customer
does not agree that you understand, clarify what you misstated and again verify your
understanding. Do not continue until the customer agrees that you understand the issue.

Verify the Issue

Verifying the issue is extremely important in successful troubleshooting. It gives you a chance to
objectively confirm the extent and the nature of the situation. In the long run, it saves time because
you do not work on the wrong issue.

Eliminate Third-Party Products

Third-party product incompatibilities can be the source of the issue. Before starting to isolate the
suspected issue with Apple equipment or software, eliminate third-party products from the system.
To the extent that you are able, make the system “all Apple” by:

 Disabling third-party software extensions in the System Folder if the computer has Mac OS 9
(choose the Mac OS Base set in the Extensions Manager control panel)
 Disconnecting third-party SCSI, USB, or FireWire peripherals

 Disconnecting third-party keyboards, mice, and other input devices
 Removing third-party PCI, AGP, or NuBus interface cards
 Removing third-party RAM (if possible)
 Disconnecting any other third-party hardware

If the issue does not occur with your “all Apple” system, the issue is most likely with the third-party
products you removed. You can proceed with troubleshooting, but be aware that additional technical
assistance may have to come from the third-party manufacturer.

Remember, the reason you are taking the third-party products out of the equation is simply to be able
to analyze an isolated “all Apple” system that is similar, if not identical, to the systems presumed
throughout this book. The goal is not to point the finger at the third-party product and walk away from
the situation. Customers bought the third-party products for a reason, and they expect to be able to
use those third-party products in conjunction with their Apple equipment. It's your job to help
customers achieve that goal.

Identify available fault finding tools and determine the most appropriate for
the identified problem
Troubleshooting network problems is often accomplished with the help of hardware and software. To
troubleshoot effectively, you need to know how these tools can be used to solve network problems.

Hardware Tools
Hardware tools were once very expensive and difficult devices to use. They are now less expensive
and easier to operate. They are helpful to identify performance trends and problems. This section
describes the most common of these tools.

Digital Voltmeters

The digital voltmeter (volt-ohm meter) is the primary all-purpose electronic measuring tool. It is
considered standard equipment for any computer or electronic technician and can reveal far more
than just the voltage across a resistance. Voltmeters can determine if:

 The cable is continuous (has no breaks).
 The cable can carry network traffic.
 Two parts of the same cable are exposed and touching (thereby causing shorts).
 An exposed part of the cable is touching another conductor, such as a metal surface.

One of the network administrator's most important functions is to confirm source voltage for the
network equipment. In North America, most electronic equipment operates on a nominal 120 volts AC
(Australian mains supply is nominally 230 volts AC). But not all outlets deliver the nominal voltage.
In older installations, especially in large industrial areas, the system load can drop voltages on a
120-volt supply to as low as 102 volts.

Operating for long periods at low voltages can cause electronic equipment problems. Low voltages
often cause intermittent faults. At the other end, voltage that is too high can cause immediate damage
to the equipment. In new construction, it is possible for circuits to be wired incorrectly and actually
put out 220 volts AC.

NOTE

With any new location or new construction, it is important to check the outlet voltages before
connecting any electronic equipment in order to verify that they are within an acceptable range.
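A voltmeter reading can be checked against the expected supply with a simple range test, sketched below. The 10 percent tolerance is an assumption for illustration only; always use the input voltage range stated on the equipment's rating plate or in the manufacturer's documentation.

```python
def check_outlet(measured_volts, nominal_volts, tolerance_pct=10.0):
    """Classify a measured outlet voltage as LOW, OK, or HIGH.

    tolerance_pct is an assumed example value, not a standard.
    """
    low_limit = nominal_volts * (1 - tolerance_pct / 100)
    high_limit = nominal_volts * (1 + tolerance_pct / 100)
    if measured_volts < low_limit:
        return "LOW"
    if measured_volts > high_limit:
        return "HIGH"
    return "OK"

# Readings discussed in the text, checked against a 120-volt nominal supply.
print(check_outlet(118, 120))   # OK
print(check_outlet(102, 120))   # LOW
print(check_outlet(220, 120))   # HIGH
```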

Time-Domain Reflectometers (TDRs)

Time-domain reflectometers (TDRs), as shown in Figure 13.1, send sonar-like pulses along cables to
locate breaks, shorts, or imperfections. Network performance suffers when the cable is not intact. If
the TDR locates a problem, the problem is analysed and the results are displayed. A TDR can locate a
break within a few feet of the actual separation in the cable. Used heavily during the installation of a
new network, TDRs are also invaluable in troubleshooting and maintaining existing networks.
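The distance that a TDR reports comes from timing the reflected pulse. The pulse travels to the fault and back, so the one-way distance is half of the propagation speed multiplied by the round-trip time. The sketch below shows the arithmetic. The velocity factor of 0.66 and the 500-nanosecond reading are illustrative assumptions; the correct velocity factor comes from the cable manufacturer's data sheet.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def distance_to_fault_m(round_trip_s, velocity_factor):
    """Distance from the TDR to the reflection point, in metres.

    velocity_factor is the signal speed in the cable as a fraction of the
    speed of light (between 0 and 1).
    """
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be greater than 0 and at most 1")
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2

# Reflection received 500 nanoseconds after the pulse was sent.
print(round(distance_to_fault_m(500e-9, 0.66), 1))   # 49.5
```

An incorrect velocity factor shifts the result proportionally, which is one reason a TDR locates a break to within a short distance rather than exactly.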

Figure 13.1 Time-domain reflectometer

Using a TDR requires special training, and not every maintenance department will have this
equipment. However, administrators need to know the capabilities of TDRs in case the network is
experiencing media failure and it is necessary to locate a break.

Advanced Cable Testers

Advanced cable testers work beyond the physical layer of the OSI reference model in the data-link
layer, network layer, and even the transport layer. They can also display information about the
condition of the physical cable.

Oscilloscopes

Oscilloscopes are electronic instruments that measure the amount of signal voltage per unit of time
and display the result on a monitor. When used with TDRs, an oscilloscope can display:

 Shorts.
 Sharp bends or crimps in the cable.
 Opens (breaks in the cable).
 Attenuation (loss of signal power).

Other Hardware Tools

Several other versatile hardware tools can serve as useful aids to network troubleshooting.

Crossover Cables

Crossover cables are used to connect two computers directly with a single patch cable. Because the
send and receive wires are reversed on one end, the send wire from one computer is connected to
the receive port on the other computer. Crossover cables are useful in troubleshooting network
connection problems. Two computers can be directly connected, bypassing the network and making
it possible to isolate and test the communication capabilities of one computer, rather than the whole
network.

Hardware Loopback

A hardware loopback device is a serial port connector that enables you to test the communication
capabilities of a computer's serial port without having to connect to another computer or peripheral
device. Instead, using the loopback, data is transmitted to a line, then returned as received data. If the
transmitted data does not return, the hardware loopback detects a hardware malfunction.

Tone Generator and Tone Locator

Tone generators are standard tools for wiring technicians in all fields. A tone generator is used to apply
an alternating or continuous tone signal to a cable or a conductor. The tone generator is attached to
one end of the cable in question. A matching tone locator is used to detect the correct cable at the
other end of the run.

These tools are also able to test for wiring continuity and line polarity. They can be used to trace
twisted-pair wiring, single conductors, and coaxial cables, among others. This pair of equipment is
sometimes referred to as "fox and hound."

Software Tools
Software tools are needed to monitor trends and identify network performance problems. This section
describes some of the more useful of these tools.

Network Monitors

Network monitors are software tools that track all or a selected part of network traffic. They examine
data packets and gather information about packet types, errors, and packet traffic to and from each
computer.

Network monitors are very useful for establishing part of the network baseline. After the baseline has
been established, you will be able to troubleshoot traffic problems and monitor network usage to
determine when it is time to upgrade. As an example, assume that after installing a new network,
you determine that network traffic is running at 40 percent of the network's intended capacity. When
you check traffic one year later, you notice that it is now at 80 percent of capacity. If you had been
monitoring it all along, you would have been able to estimate the rate at which traffic was increasing
and predict when to upgrade before failure occurs.
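The prediction in this example can be worked through with a short calculation. The sketch below assumes that traffic keeps growing at a steady (linear) rate, which is a simplification; real growth is often uneven, so regular monitoring is still needed.

```python
def months_until_capacity(util_start_pct, util_now_pct, months_elapsed, limit_pct=100.0):
    """Estimate months remaining before utilization reaches limit_pct,
    assuming the same steady growth continues. Returns None if
    utilization is not growing."""
    growth = util_now_pct - util_start_pct
    if growth <= 0:
        return None
    return max(0.0, (limit_pct - util_now_pct) * months_elapsed / growth)

# 40 percent at installation, 80 percent twelve months later.
print(months_until_capacity(40, 80, 12))                 # 6.0
# Planning to upgrade before utilization passes 90 percent.
print(months_until_capacity(40, 80, 12, limit_pct=90))   # 3.0
```

Under this assumption the network would be saturated about six months after the second measurement, so the upgrade should be planned well before then.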

Protocol Analyzers

Protocol analyzers, also called "network analyzers," perform real-time network traffic analysis using
packet capture, decoding, and transmission data. Network administrators who work with large
networks rely heavily on the protocol analyzer. These are the tools used most often to monitor
network interactivity.

Protocol analyzers look inside the packet to identify a problem. They can also generate statistics based
on network traffic to help create a picture of the network, including the:

 Cabling.
 Software.
 File servers.
 Workstations.
 Network interface cards.

Some hardware protocol analyzers have built-in TDRs, discussed in the previous section.

The protocol analyzer can provide insights and detect network problems including:

 Faulty network components.
 Configuration or connection errors.
 LAN bottlenecks.
 Traffic fluctuations.
 Protocol problems.
 Applications that might conflict.
 Unusual server traffic.

Protocol analyzers can identify a wide range of network behavior. They can:

 Identify the most active computers.
 Identify computers that are sending error-filled packets. If one computer's heavy traffic is
slowing down the network, the computer should be moved to another network segment. If a
computer is generating bad packets, it should be removed and repaired.
 View and filter certain types of packets. This is helpful for routing traffic. Protocol analyzers
can determine what type of traffic is passing across a given network segment.
 Track network performance to identify trends. Recognizing trends can help an administrator
better plan and configure the network.
 Check components, connections, and cabling by generating test packets and tracking the
results.
 Identify problem conditions by setting parameters to generate alerts.
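The first two capabilities in this list, finding the most active computers and the computers sending bad packets, amount to counting captured frames by source. The following sketch shows the idea using a made-up capture. The tuple layout and host names are assumptions for illustration; a real analyzer exports far more detail per frame.

```python
from collections import Counter

def summarize_capture(frames):
    """frames: iterable of (source_host, size_bytes, has_error) tuples.

    Returns two lists sorted from highest to lowest:
    bytes sent per host, and error frames per host.
    """
    bytes_sent = Counter()
    bad_frames = Counter()
    for source, size, has_error in frames:
        bytes_sent[source] += size
        if has_error:
            bad_frames[source] += 1
    return bytes_sent.most_common(), bad_frames.most_common()

# Hypothetical capture from one network segment.
capture = [
    ("pc-01", 1500, False), ("pc-02", 64, True), ("pc-01", 1500, False),
    ("pc-03", 512, False), ("pc-02", 64, True), ("pc-01", 1200, False),
]

talkers, offenders = summarize_capture(capture)
print(talkers[0])    # ('pc-01', 4200)
print(offenders)     # [('pc-02', 2)]
```

Here pc-01 generates the most traffic and might be a candidate for another segment, while pc-02 is the only source of error frames and should be inspected.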

Network General Sniffer

Sniffer, which is part of a family of analyzers from Network General, can decode and interpret frames
from 14 protocols including AppleTalk, Windows NT, NetWare, SNA, TCP/IP, VINES, and X.25. Sniffer
measures network traffic in kilobytes per second, frames per second, or as a percentage of available
bandwidth. It will gather LAN traffic statistics, detect faults such as beaconing, and present this
information in a profile of the LAN. Sniffer can also identify bottlenecks by capturing frames between
computers and displaying the results.

Novell's LANalyzer

The LANalyzer software performs much the same function as Sniffer but is available only on a NetWare
LAN.

Monitoring and Troubleshooting Tools


After a network has been installed and is operational, the administrator needs to make sure it
performs effectively. To do this, the administrator will need to manage and keep track of every aspect
of the network's performance.

Network Management Overview

The scope of a network management program depends on:

 The size of the network.
 The size and capabilities of the network support staff.
 The organisation's network operating budget.
 The organisation's expectations of the network.

Small peer-to-peer networks consisting of 10 or fewer computers can be monitored visually by one
support person. However, a large network or WAN might need a dedicated staff and sophisticated
equipment to perform proper network monitoring.

One way to ensure that the network does not fail is to observe certain aspects of its day-to-day
behavior. By consistently monitoring the network, you will notice if any areas begin to show a decline
in performance.

Performance Monitors

Most current network operating systems include a monitoring utility that will help a network
administrator keep track of a network's server performance. These monitors can view operations in
both real time and recorded time for:

 Processors.
 Hard disks.
 Memory.
 Network utilization.
 The network as a whole.

These monitors can:

 Record the performance data.
 Send an alert to the network manager.
 Start another program that can adjust the system back into acceptable ranges.

When monitoring a network, it is important to establish a "baseline." This documentation of the
network's normal operating values should be periodically updated as changes are made to the
network. The baseline information can help you identify and monitor dramatic and subtle changes in
your network's performance.

Network Monitors

Some servers include network monitoring software. Windows NT Server, for example, includes a
diagnostic tool called Network Monitor, shown in Figure 13.2. This tool gives the administrator the
ability to capture and analyze network data streams to and from the server. This data is used to
troubleshoot potential network problems.

The packets of data in the data stream consist of the following information:

 The source address of the computer that sent the message.
 The destination address of the computer that received the frame.
 Headers from each protocol used to send the frame.
 The data or a portion of the information being sent.
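To make the list above concrete, here is a small Python sketch that splits one captured frame into those parts. It assumes the common Ethernet II layout (a 6-byte destination address, a 6-byte source address and a 2-byte protocol type, followed by the data); the sample frame and its addresses are made up for illustration, and this is not how Network Monitor itself is implemented.

```python
import struct

def format_mac(raw):
    """Render six raw bytes as a colon-separated hardware address."""
    return ":".join(f"{byte:02x}" for byte in raw)

def parse_ethernet_frame(frame):
    """Split an Ethernet II frame into addresses, protocol type and payload."""
    destination, source, ether_type = struct.unpack("!6s6sH", frame[:14])
    return {
        "destination": format_mac(destination),
        "source": format_mac(source),
        "ether_type": hex(ether_type),
        "payload": frame[14:],
    }

# A hand-built sample frame (made-up source address) carrying an IPv4 payload.
sample_frame = (
    bytes.fromhex("ffffffffffff")      # destination: broadcast
    + bytes.fromhex("00112233aabb")    # source
    + bytes.fromhex("0800")            # protocol type 0x0800 = IPv4
    + b"example data"
)
print(parse_ethernet_frame(sample_frame))
```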

Figure 13.2 Windows NT Network Monitor

Simple Network Management Protocol (SNMP)

Network management software follows standards created by network equipment vendors. One of
these standards is the simple network management protocol (SNMP).

In an SNMP environment, illustrated in Figure 13.3, programs called "agents" are loaded onto each
managed device. The agents monitor network traffic in order to gather statistical data. This data is
stored in a management information base (MIB).

SNMP components include:

 Hubs.
 Servers.
 NICs.
 Routers and bridges.
 Other specialized network equipment.

Figure 13.3 SNMP environment showing components

To collect the information in a usable form, a management program console regularly polls these
agents and downloads the information from their MIBs. After the raw information has been collected,
the management program can perform two more tasks:

 Present the information in the form of graphs, maps, and charts
 Send the information to designated database programs to be analysed

If any of the data falls above or below thresholds set by the manager, the management program can
notify the administrator by means of alerts on the computer or by automatically dialling a pager
number. The support staff can then use the management console program to implement changes in
the network.
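The polling and threshold checking described above can be sketched in a few lines of Python. This is a simulation only: the dictionaries stand in for the MIBs held by real agents, no SNMP traffic is sent, and the device names, statistics and limits are all hypothetical.

```python
# Simulated agents: each dictionary stands in for the MIB held by one managed device.
AGENT_MIBS = {
    "router-1": {"cpu_percent": 35, "errors_per_min": 2},
    "server-1": {"cpu_percent": 97, "errors_per_min": 0},
}

# Acceptable (low, high) range for each statistic, as set by the manager.
THRESHOLDS = {
    "cpu_percent": (0, 90),
    "errors_per_min": (0, 10),
}

def poll_agents(agent_mibs):
    """Collect a copy of every agent's statistics, as a management console does when polling."""
    return {device: dict(mib) for device, mib in agent_mibs.items()}

def find_alerts(collected, thresholds):
    """Return (device, statistic, value) for every value outside its acceptable range."""
    alerts = []
    for device, stats in collected.items():
        for name, value in stats.items():
            low, high = thresholds[name]
            if value < low or value > high:
                alerts.append((device, name, value))
    return alerts

for device, name, value in find_alerts(poll_agents(AGENT_MIBS), THRESHOLDS):
    print(f"ALERT: {device} {name} = {value}")
```

In a real deployment the alert would be passed to whatever notification method the management program supports rather than printed.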

Activity 3

1. The _____________ ________________ is the primary all-purpose electronic measuring
tool used by computer and electronic technicians.
2. _________ - ______________ _______________________ send sonar-like pulses along
cables to locate breaks, shorts, or imperfections.
3. __________________ are electronic instruments that measure the amount of signal
voltage per unit of time and display the results on a monitor.
4. In a crossover cable, the send wire from one computer is connected to the ____________
port on the other computer.


5. Protocol analyzers, also called "network analyzers," perform __________ -__________
network traffic analysis using packet capture, decoding, and transmission data.
6. A ____________________ _________________ can help to establish a network’s
information baseline.
7. A network monitor allows the administrator to capture and analyze network
___________ _______________ to and from the server.

Obtain the required fault finding tools


Tools are indispensable as they support the fault finding process. Tools are critical in helping a
technician during information gathering, formulating a hypothesis, testing and resolution.

Software tools

Software tools are programs run on a computer system for diagnosing, collecting information
and auditing purposes. Generally, these tools are used in conjunction with a cyclic fault-finding
method in order to gather and evaluate system information. Many of these tools simply interact with
the system to gain access to areas that are usually hidden from the user. It is important to note that,
most of the time, such tools will not fix the fault, but will rather report on conditions and perhaps
point to possible sources of conflict. It is up to you, the technician, to identify the best
course of action (i.e. devise an action plan).
Additionally, many software tools are bundled with the operating system for performing
operating-system-specific troubleshooting. The advantage of these tools is that they integrate well
with the OS, perhaps yielding more accurate results than third party tools.

Auditing and reporting tools

Auditing tools are designed to gather information about system critical components. For example, a
typical auditing tool might report on areas such as security, system resources (RAM, CPU, IRQ etc.),
hardware peripherals, software installed including version and licence details, software drivers etc. It
is up to you to access this information, analyse it and decide what should be done about it.
Reporting tools are simply dedicated to extracting available information from system repositories such
as event logs and the Windows registry, and then presenting it to the user in a report format. A typical
example is the System Information tool available in most versions of Windows.
See figure below.

Windows System Information tool
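To illustrate what a reporting tool does at its simplest, the Python sketch below collects a handful of facts about the computer it runs on and prints them as a plain-text report. It uses only Python's standard library; the choice of items and the report layout are assumptions made for illustration and are far less thorough than a real auditing tool.

```python
import platform
import shutil

def collect_system_profile(path="."):
    """Gather a few basic facts about the computer this script runs on."""
    disk = shutil.disk_usage(path)
    return {
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "machine_type": platform.machine(),
        "computer_name": platform.node(),
        "disk_total_gb": round(disk.total / 1e9, 1),
        "disk_free_gb": round(disk.free / 1e9, 1),
    }

def format_report(profile):
    """Present the collected values as a plain-text report, one item per line."""
    return "\n".join(f"{key}: {value}" for key, value in profile.items())

print(format_report(collect_system_profile()))
```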

Third party tools can also be very useful, especially if they are not operating-system dependent. The
Belarc Advisor is one such tool that can run on just about any version of Windows (generally, you will
need a separate tool to audit across platforms, i.e. Windows, Linux, Mac). Belarc Advisor will profile a
computer system and output its report as an HTML file, which can be viewed in any web browser, and
possibly be uploaded to a central location. This program is available free for personal use at
www.belarc.com.

Manufacturer’s tools

Many of the larger manufacturers, such as IBM, Dell and Hewlett Packard, make special diagnostic
software that is expressly designed for their systems. This software normally consists of a suite of tests
that thoroughly examines the system.
Disk manufacturers usually supply tools that are specific to their storage devices for tasks such as
formatting, geometry translation, diagnostics etc.

Diagnostic tools

Software diagnostic tools are more comprehensive in that not only do they collect and report
information, but they can suggest a course of action to solve a problem.
Many of these tools are commercially available. One such tool is Norton System Works. For more
information regarding this product visit www.symantec.com - go to "Products" and look for "Norton
System Works".

Another diagnostic tool that you might find useful is SiSoftware’s SANDRA (System Analyser, Diagnostic
and Reporting Assistant). Take a look at the SiSoftware website (http://www.sisoftware.co.uk/) for
more information. In particular look at the first article "Who/What is SANDRA?"

Network tools

Network fault-finding can be very challenging. Nevertheless, there are many tools that can help us
in our fault-finding quest. Networks can be very complex, and although the same methodology
applies to network fault-finding, the approach needs to be adapted to that complexity.
Many network engineers choose to modularise the network (break it into modules or parts), so that
troubleshooting can concentrate on more specific areas rather than look at the network as one large
and complex system. The preferred method is to use the OSI (Open Systems Interconnection) model.
If you have not heard of the OSI model, now is a good time to learn about it. A good place to look for
a concise explanation is www.webopedia.com.

Web search activity


Go online and use your preferred search engine to do a search on ‘The OSI Model’ and
‘The 7 layers of the OSI model’. For a good overview, start with the Wikipedia listing for "OSI
model".
Network connectivity tools are among the most common yet effective tools for fault-finding
networks. Network connectivity refers to the ability of a network device to establish connections to
other network devices as part of normal network operations. Network connectivity tools test for data
connection and communication between network devices. Such tools include generic utilities such as
Ping (Packet InterNet Groper), tracert/traceroute and telnet. These tools are available under
virtually all operating systems that support TCP/IP.
Ping and tracert/traceroute rely on ICMP (Internet Control Message Protocol) messages to detect faults
in connectivity on a TCP/IP network. Keep in mind that the great majority of networks, regardless of
the OS platform, are based around TCP/IP for connectivity. A typical test using Ping involves sending
a series of echo request messages (known as pinging) and waiting for replies. If replies are received,
basic connectivity is available. In the example below, we can see a PC pinging its ‘default gateway’.

Command prompt window showing PC pinging default gateway
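A ping test can also be scripted, which is handy when many devices have to be checked. The Python sketch below builds the command line for the operating system's own ping utility (Windows uses -n for the number of requests, Unix-like systems use -c) and treats a zero exit code as "replies received". The function names are illustrative and 192.168.1.1 is only a placeholder for your own default gateway address.

```python
import platform
import subprocess

def build_ping_command(host, count=4, system=None):
    """Build the ping command line; Windows uses -n for the count, Unix-like systems use -c."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]

def is_reachable(host, count=4):
    """Run ping and return True if it exits successfully (exit code 0)."""
    result = subprocess.run(
        build_ping_command(host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

# 192.168.1.1 is only a placeholder for your own default gateway address.
print(build_ping_command("192.168.1.1"))
```

Calling is_reachable("192.168.1.1") would run the test itself. The exit code is only a rough indicator, so read the ping output as well before drawing conclusions.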

The tracert (Windows)/traceroute (Unix) command is also useful to pinpoint the actual source of
trouble along a network route. A trace works by sending probes with an increasing time-to-live (TTL)
value, so that every host/router along the network path, from source to destination, responds in turn.
In the example below, we can see the path taken by data when delivered from a system to
www.cisco.com.

Command prompt window showing network trace - some IP addresses have been deleted for security.
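The way a trace narrows down a fault can be shown with a small simulation. The Python sketch below sends no network traffic at all: it walks a made-up route hop by hop, in the same order that probes with a TTL of 1, 2, 3 and so on would expire, and marks each hop that fails to answer with an asterisk. The route, the addresses (taken from ranges reserved for documentation) and the function names are all assumptions for illustration.

```python
def trace_route(path):
    """Simulate a trace over `path`, an ordered list of (address, is_up) pairs.

    The probe with TTL n expires at the n-th hop. Once a hop is down,
    nothing beyond it can answer either.
    """
    results = []
    link_ok = True
    for ttl, (address, is_up) in enumerate(path, start=1):
        link_ok = link_ok and is_up
        results.append((ttl, address if link_ok else "*"))
    return results

def last_responding_hop(results):
    """Return the address of the last hop that answered, or None if none did."""
    answered = [address for _, address in results if address != "*"]
    return answered[-1] if answered else None

# Made-up route; the third router is down.
route = [
    ("192.0.2.1", True),
    ("198.51.100.1", True),
    ("203.0.113.1", False),
    ("203.0.113.50", True),
]
for ttl, address in trace_route(route):
    print(ttl, address)
print("Last hop that answered:", last_responding_hop(trace_route(route)))
```

The fault is most likely at, or just beyond, the last hop that answered, which is where further investigation should start.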

Protocol analysers and packet sniffers constitute another useful alternative when troubleshooting
network faults. These tools allow you to capture network frames/packets for subsequent analysis and
decoding. A skilled engineer would be able to ‘see’ any problems on the network, simply by looking
for certain patterns in the network traffic such as errors, time-outs, excessive broadcasts etc. High
level skills and training are needed.
Ethereal (since renamed Wireshark) is an open source packet sniffer/protocol analyser available for
Windows as well as for Unix-like operating systems. Take a look at the Ethereal website
(www.ethereal.com) - in particular look at the "Introduction" page for an overview and screenshots of
Ethereal in action.
Keep in mind that packet capture and protocol analysis must be exercised with great care. Many public
networks and service providers ban the use of such tools. Packet capture is often perceived as a hostile
activity, usually indicating hacking/cracking activity, because of the information that may be gained
from captured data transmissions. However, network administrators and troubleshooting engineers
may use these tools legitimately in their own networks when trying to isolate problems.

Hardware Tools

Hardware tools do not just refer to screwdrivers and side-cutters. Hardware tools might include
devices such as POST cards, multimeters – to measure voltages, continuity etc., and cable testers – to
test cabling installations.

POST (power-on self-test) cards – are specialist electronic cards that plug in to a system’s main board.
The purpose of these cards is to perform functionality tests on a system, and report on failed
devices/modules. Some POST cards can interact with software, enabling sophisticated fault finding.
POST cards are not as common these days, since many failed main boards are simply replaced without
further investigation (mostly for economic reasons).
Multimeters – are very useful devices that enable you to measure at least voltage, resistance and
continuity (whether a circuit is open or shorted). Many technicians will carry a multimeter and use it
to measure the output from power supplies, the continuity of a cable etc.

A basic Multimeter
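Once a power supply's outputs have been measured with a multimeter, the readings need to be compared with the expected values. The Python sketch below shows one way to do that check. The three rails, the sample readings and the tolerance of plus or minus 5% are assumptions for illustration; always use the figures in the power supply's own specification.

```python
# Nominal output voltages to check (an assumed set of rails for illustration).
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05  # assumed allowance of plus or minus 5%

def check_rail(nominal, measured, tolerance=TOLERANCE):
    """Return True if the measured voltage is within the allowed band around nominal."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= measured <= high

def check_readings(readings):
    """Check every measured rail and return {rail: True/False}."""
    return {rail: check_rail(NOMINAL_RAILS[rail], value) for rail, value in readings.items()}

# Hypothetical multimeter readings in volts.
readings = {"+3.3V": 3.31, "+5V": 5.08, "+12V": 11.2}
for rail, ok in check_readings(readings).items():
    print(rail, "OK" if ok else "OUT OF RANGE")
```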

Cable testers – are common in many IT departments. Even though cabling is the domain of licensed
cabling installers, it is very useful to be able to check a segment for basic functionality. Most basic
electronic cable testers will test things such as Wire map (UTP – correct wiring), length, opens and
shorts.

A basic cable tester
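The wire map test mentioned above can be pictured as a table showing which pin at the far end each near-end pin reaches. The Python sketch below classifies such a table for an eight-pin UTP cable. It is a simplified model for illustration: the function name is hypothetical, only the 10/100 Mbps crossover pattern (pins 1 and 2 swapped with pins 3 and 6) is recognised, and a "short" is modelled as two near-end pins reaching the same far-end pin.

```python
STRAIGHT = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8}
# 10/100 Mbps crossover: the transmit pair (1, 2) swaps with the receive pair (3, 6).
CROSSOVER = {1: 3, 2: 6, 3: 1, 4: 4, 5: 5, 6: 2, 7: 7, 8: 8}

def classify_wire_map(wire_map):
    """Classify a cable from its wire map.

    `wire_map` maps each near-end pin to the far-end pin it reaches
    (None means no continuity was found).
    """
    if any(far is None for far in wire_map.values()):
        return "open"
    far_pins = list(wire_map.values())
    if len(set(far_pins)) < len(far_pins):
        return "short"  # two near-end pins reach the same far-end pin
    if wire_map == STRAIGHT:
        return "straight-through"
    if wire_map == CROSSOVER:
        return "crossover"
    return "miswired"

print(classify_wire_map(STRAIGHT))
print(classify_wire_map(CROSSOVER))
print(classify_wire_map({**STRAIGHT, 4: None}))
```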

Many other data cable testers are available in the market place. For information about such products
you might want to visit www.agilent.com or www.fluke.com. Both companies produce several
products for testing network media such as UTP and Fiber optic.

Hardware Toolkits

Toolkits are necessary and very useful to have available. Technicians will have different opinions
about what should be part of a technician’s essential toolkit. This section introduces some
essential items that most people would recommend as basic tools.
A generic toolkit should contain at least the following:
 A medium size Phillips screwdriver
 A small size Phillips screwdriver
 A medium size flat screwdriver
 A small size flat screwdriver
 A pair of small side cutters
 A pair of nose-pliers
 A medium size Torx screwdriver
 A small hexagonal driver
 An anti-static wrist strap
 A multimeter
Other items that can be considered optional may include:
 Varying sizes hexagonal drivers
 IC extraction tool
 IC insertion tool
 Tweezers to pick up small parts
 Small plastic containers to hold loose screws, small parts
 Compressed air can to blow dust and small particles
 Simple cable tester

This reading has presented an overview of cyclic fault-finding methods. As you will have observed,
fault-finding demands that you are able to apply logical and systematic procedures in order to arrive
at the solution of a problem. Fault-finding varies in approach but the scientific method, which
proposes to gather information, define problems, form hypotheses, test and draw conclusions, is
sound. Repetition is the key to fault-finding until a satisfactory outcome is achieved. Your ability to
apply the methods outlined in this reading will determine your success at fault-finding.

This reading has also presented an overview of hardware and software tools that you might use when
fault-finding. This is not meant to be an exhaustive list of tools but a sample of what is available.
Without doubt, you will use some of these tools in your IT career, and you will find other tools that
you prefer. Fault-finding tends to be an ongoing process, and with new technologies emerging all the
time, new tools will be developed to make fault-finding more effective and streamlined.

Identify legislation, health and safety requirements, codes, regulations and standards related to the problem area⁴
Areas of computer health and safety

Most computer health and safety concerns involve bad ergonomics. Problems like repetitive strain
injury (RSI), aches, pains and numbness can be caused by poor-quality keyboards and mice or badly
adjusted chairs.

Some people find looking at monitors and displays for long periods can cause eye-strain or headaches
too.

However, there are other issues to be aware of. As with all electrical equipment, computers can be
dangerous when used incorrectly or if turned on with the case open. Trailing cables can also be
hazardous.

Installing and maintaining computer hardware can involve physical risks. For example, your staff
should be careful when running cabling above a suspended ceiling or placing network equipment into
confined spaces.

Computer health and safety: why it matters

Although suffering a serious injury while working with IT is unlikely, computer health and safety
problems can be persistent. Employees may be forced to take time off or prevented from using
computers at all.

Practicing good computer health and safety will ensure the well-being of everyone in your business
when they use IT. You should always aim to minimise risks, but there are specific legal requirements
too. For instance, if an employee works with display screen equipment, you are required to pay for
them to have regular eyesight tests if they ask you to.

Failing to take adequate computer health and safety precautions can signal to employees that you’re
not bothered about their welfare. And the legal obligations mean you could also be at risk of paying
out damages should any computer health and safety problems arise.

⁴ Source: Tech Donut, at https://www.techdonut.co.uk/it/staff-and-it-training/computer-health-and-safety,
as at 10th December 2016; ACS, at http://www.acs.edu.au/download/samples/comservice.pdf, as at 10th
December 2016.

IT risk assessment

Taking care of IT and computer health and safety should be part of your overall health and safety
strategy. Make sure someone has responsibility for it. Give them the power and budget to provide
adequately for your staff.

You cannot afford to be complacent. Every workplace can pose some risk including apparently low-
risk offices. Carry out a thorough IT risk assessment to understand where the dangers lie in your
company. Be systematic: start with a list of the most common IT problems and speak to your
employees to identify any you may have missed.

Once you’ve carried out your IT risk assessment, identify a solution to each issue you find. For instance,
if employees are using laptops for long periods of time, you should provide them with a keyboard,
mouse and monitor to make their work more comfortable.

Computer health and safety policies and training

You should have an IT policy explaining computer health and safety best practice. Set out what
equipment employees should have access to and how they can address issues. Again, someone in your
business should be responsible for this – make sure your staff know who.

Give all employees training in the basics of setting up and using computer equipment. It’s particularly
important everyone understands how to adjust their chair, keyboard, mouse and screen properly.

Encourage people to speak up if they're worried about computer health and safety. Minor physical
ailments tend to become major ones if not addressed early, so make it easy for people to raise
concerns.

TOOLS
1. The Basics
For basic computer servicing you need the following tools:
Basic hand tools
There are various 'computer tool kits' on the market, which are designed to provide the equipment
needed by a technician. They are primarily intended to provide equipment needed to assemble or
disassemble a computer and/or peripherals. They may or may not also include other equipment. Cost,
range of tools, and the quality of equipment can vary a great deal.

The most basic equipment may include:


• Socket spanners (2 or more different sizes)
• Standard screw drivers – You will require one small and one large flat bladed screwdriver. Most cases
are held together with Phillips screws with a slot across to accommodate flat bladed screwdrivers but
some are now being fitted with security screws that require special screwdrivers. Check before setting
off for a site visit.
• Phillips head screw drivers – Small Phillips screwdrivers for power supplies, interface cards and hard
and floppy disk mountings will also be required.
• Long nosed pliers
• Wire cutter/stripper

• Chip extractor – Special types are available for processor and chip removal to help to prevent
accidental damage. The legs of a chip are particularly fragile.
• Chip inserter
• Tweezers – Extremely useful when small parts are dropped in inaccessible places.
• Parts grabbers (claw type)
• Torx drivers – These are used to remove the star shaped screw heads found on many Compaq
machines.
• Clamps
• A flashlight and magnifying glass – To look under motherboards and in dark parts of the PC case and
to make markings on the motherboard easier to view.
• A small plastic container – For keeping screws, nuts, retaining straps etc.

Some more essential tools will be:

A digital multimeter
This is used for testing power supply voltages and cable connectivity. Many troubleshooting
procedures require voltage and resistance to be tested. Values are measured using a hand-held
multimeter. The meter may have an analog or a digital (LCD) read out and will use a pair of probes to
connect to the device being tested.

Cleaning equipment and materials
E.g. contact cleaning chemicals, compressed air, bristle brush, hand vacuum cleaner.

Wrap plugs
These are used to diagnose serial and parallel port problems.

Software for testing/diagnosing components in a system

Diagnostics hardware for testing components in a system

Anti-static wrist band, mats and anti-static bags
Damage can be caused to circuit boards by static discharge, therefore anti-static equipment is vital.
Spare PC components should always be stored in protective anti-static bags, such as those used by
manufacturers to supply interface cards.
2. Advanced requirements
The following equipment is more specialized, not required so frequently, but nevertheless useful:
• Specialized hand tools, such as chip removal tools for pin grid array (PGA) and plastic leaded chip
carrier (PLCC) chips. The importance of these tools cannot be overstated. If you try to pull out a
processor chip without one of these tools, you are likely to damage some expensive equipment.
• Soldering tools, such as a soldering iron

Logic probes and pulsers


These are used to analyse and test digital circuits.

Power supply testing equipment
Variable voltage transformers
Load testers
Memory testing machines

These may be used to evaluate the operation of computer chips, memory modules etc.

Oscilloscopes
These can be used to accurately display digital and analog signals, to analyse their purity and timing.

ESD kit
This is an electrostatic discharge protection kit.

GUIDELINES FOR USING BASIC TOOLS


Electrical safety
Voltages used for domestic power supplies vary between 110 and 240 V, sufficient to give a serious
electric shock. Display equipment such as computer monitors can generate and store voltages of up to
30,000 volts (30 kV). These voltages can be present even when the equipment has been switched off
for some time.
It is vital that basic electrical safety guidelines are followed at all times when working on electrical
equipment. In conjunction with any additional formal instructions, the following should always be
noted.
• Do not work with electrical equipment unless you know what you are doing and are sure of the
consequences.
• Remove all jewelry while working on electrical equipment.
• Beware of a build-up of static electricity or electromagnetic energy: insulate yourself and be cautious.
• Use extreme care when applying any of the above tools. In general, most adjustments will not have
to be forced.
• Use the right tool for the right job, and do not bend or damage parts.
• Use chip extraction or insertion tools to handle chips, and be cautious not to bend any pins on a
chip.
• Always replace blown fuses with one of the correct rating and always check that the existing fuse
was rated correctly.
• Never work alone – there should be always someone nearby to assist in an emergency.

SOLDERING
There will be occasions when a soldering iron will be necessary to fix a broken wire or similar problem
on a circuit board. Not all boards are the same design-wise, so soldering on the motherboard should be
minimal and then only on components that can be pulled through. Never throw out old motherboards
as these will be ideal to practice upon.

In general, only those experienced in using a soldering iron should use this tool on a computer. The
soldering iron will need to be specially selected: use one of no more than 25 watts, as hotter irons
can damage other components.

A solder sucker will allow more precision as well as quicker working speed. All it does is keep the area
being worked upon clean of hot solder when dismantling pieces.

USING AN ELECTRICIAN
It may be illegal to tamper with the electrical system of a building. However, faults in the building's
electrical system may be the source of problems in a computer, in which case an electrician should be
called in.

Be aware of how far you can go legally!

The Computer workshop


When first establishing a computer workshop, generally one of the major restricting factors will be
cost. Therefore any workshop will need to be cost efficient, whether it is owned or leased. It should
be large enough for uncluttered working and storage areas.

Most workshops also have at least a minimal display area for new or even 2nd hand product sales.
Many computer workshops only have a small front office area for customer pick up and some display,
while out the back is work space and storage. This type of workshop should be cheaper to lease than
one with a large showroom area.

Some initial research will need to be done to determine the amount of business that might be
expected. This will probably dictate size, location and cost of any workshop. Many computer
technicians work from home initially until they feel that they have a sufficient customer base to
warrant expansion into a larger site.

Main components of computer shop offering servicing & other facilities


• Counter for sales, security of cash, goods and staff, dealing with customers.
• Sales/service display.
• Workbenches.
• Adequate space and lighting and electricity (including work and storage areas).
• Customer waiting area (optional but very useful)
• Cash register/ credit card facilities.
• Telephone.
• Tools associated with computer repair.

ALSO USEFUL
• Computer and printer for business applications.
• Photocopying machine.
• Lunch and Tea-room facilities.

Workshop layout
The workshop layout must be practical, comfortable and within registered government health and
safety standards. One problem that computer workshops seem to be afflicted with is insufficient
storage space. Poor layout could result in a cluttered and inefficient workshop. Parts that are being
replaced but not discarded, repaired computers whose owners are not in a rush to pick up, new parts
that have been ordered can all contribute to this problem. Having good workshop layout as well as
good ordering and customer awareness data should alleviate this problem to some extent.

Ensure a workshop is safe.
• install guards on any dangerous equipment (e.g. exposed electricity)
• place grates over vents or other exposed holes (e.g. floor drains which may trip you)
• install non-slip surfaces where necessary
• set procedures (e.g. only trained, competent and authorised staff allowed to use or repair machinery
or equipment)
• provide a 'kill switch' (instant shut-off) on dangerous equipment

Always assume electrical systems are 'live'. Test and tag electrical systems.

TOOL MAINTENANCE
Looking after your tools is very important to safety! If you look after them then they are reliable. Tools
in good condition will perform in a predictable fashion. Some simple reminders are listed below:
1. Metal
Metal tools may corrode. To prevent rust or corrosion metal either needs painting with a good metal
primer, or regular coating with oil (After using, clean and wipe metal parts with an oily rag).
2. Sharpening
Some tools need to be sharp. Keeping your tools sharp usually means less effort is required to use
them, so less strain is applied, and you are less likely to slip.
3. Cleaning
If tools are kept clean they are less likely to corrode or have moving parts seize. This also reduces the
likelihood of microorganisms being carried on tools (and the chances of being infected if you cut
yourself). Wiping a knife blade with methylated spirits can be an effective way of destroying any
microorganisms.

4. Storing
Keeping your tools stored properly means they are less likely to be damaged, lost or stolen. They can
also be found more easily when required, saving time.

Tools left lying around can also be dangerous, particularly if you have young children, or they can be
used by burglars to break into your house, garage, sheds etc.

MANAGING THE WORKSHOP


Charging/pricing
The computer industry is an extremely cut-throat business as far as pricing goes. Due to a never-ending
stream of newer and updated technology, many computer components are quickly out of date and
therefore useless if not sold on during their heyday. State of the art components may only fall into that
category for as little as 1 - 3 months before they are superseded. It is pointless and bad business to
have stock that will not be used lying about the workshop. Order stock only when necessary.

Charging for computer repair is another area of the computer industry which needs to be approached
carefully. Often older machines will not warrant repair but could possibly be upgraded. Customers
should be made aware of their options when they approach you with a problem. All costs involved
should be explained and itemized.

This is simply good customer relations and conveys a sense of professionalism to your customers.
Remember that computers are still somewhat mysterious machines, misunderstood by
large sections of the population.

RECORD KEEPING
Good record keeping is simply good business sense. All transactions, repairs undertaken and parts
ordered must be put on record. Apart from enabling financial planning for the future, it also means
fewer headaches at tax time. It allows you to chart the progress of the business and to change those
areas that are lacking or unnecessary. If you do not feel comfortable with record keeping, then you
could either enroll in a short course or hire a professional to do this for you.

FINANCIAL RECORDS
It is extremely important to keep accurate, clear and accessible records of all financial transactions
which take place in a business. Different businesses have different types of bookkeeping systems, and
the options are many. A balance must be struck between a system which gives you the detail you
require and one which doesn't take too much time to maintain.

Financial records are needed because:


• They help you manage your finances. You can make decisions about what something is likely to cost in
the future by seeing what it cost in the past.
• They allow you to see whether your business is making a profit or loss.
• They give you a basis upon which you can calculate what you will charge your customers.
• They allow you to prepare and submit your tax returns.
• They are legally required by government.

THE SIMPLEST APPROACH


Many small businesses do little more than keeping a record of money spent in their cheque book (and
all spending is deliberately channeled through the cheque book); and keeping a record of payments
received in their pay in bank book. At the end of each financial year, these records are given to their
accountant, who then prepares their taxation return and any other necessary financial records (such
as a balance sheet or profit and loss statement). There is nothing wrong with this approach, though
something better is generally desirable.

Work scheduling
It is important when beginning a business to give your customers the very best service that you
possibly can. New customers who have not dealt with you before will be watching for signs of
tardiness, inattention, sloppiness of work etc. They will also be quick to tell others about excellent
and genuine service.

Word of mouth advertising is priceless and should never be underestimated. It is for these reasons
that work scheduling is important. Work out approximately how much time standard repairs will take,
including diagnosis, dismantling, repair and reassembly.

DO NOT overbook either yourself or your workers as this will result in mistakes, repairs that are
brought back, and time and money wasted.
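The arithmetic behind this advice is simple enough to sketch in Python. The stage times and the 450 working minutes per technician per day used below are hypothetical figures for illustration only; substitute your own estimates.

```python
# Hypothetical time estimates, in minutes, for each stage of a standard repair.
STAGE_MINUTES = {"diagnosis": 30, "dismantling": 15, "repair": 45, "reassembly": 20}

def repair_minutes(stages=STAGE_MINUTES):
    """Total time for one standard repair."""
    return sum(stages.values())

def is_overbooked(jobs_booked, technicians, minutes_per_day=450):
    """Compare the work booked for a day with the time the technicians can provide."""
    return jobs_booked * repair_minutes() > technicians * minutes_per_day

print(repair_minutes())       # minutes needed for one standard repair
print(is_overbooked(5, 1))    # five jobs booked for one technician
print(is_overbooked(3, 1))    # three jobs booked for one technician
```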

The following notes apply to work scheduling in an organisation of considerable size with a number of
employees involved. The basic premises however can be applied to businesses of any size including a
small workshop operation.

Before planning can commence, you need to know:


• Details of all major programs which might affect decisions which may be made (e.g. budgets, costs,
resources etc.).
• Policies of the organisation. (Work schedules are confined by such policies, which might include not
working Sundays for instance.)
• Expectations from management. (What amount of work is expected from this section?)

Planning a work schedule involves a similar process to the problem solving technique:
Step 1. Define objectives, goals, tasks to be achieved.
Step 2. Put forward several alternative courses of action.
Step 3. Make a decision which of the alternative courses of action will give the best result.
Step 4. Put the chosen plan into action.

Collect data relevant to the system⁵


Fault finding methods
There are literally hundreds of hardware and software tools that have the ability to help diagnose and
solve problems. These tools are somewhat limited unless you can use them in conjunction with sound
fault-finding methods. This reading will introduce you to fault-finding methods.

The scientific method to fault-finding


Fault-finding is a skill required in just about every industry, not just in IT support. If you search
the Internet for fault-finding information, you are likely to encounter massive amounts of
literature for many industries, not just IT. In general, those areas of industry which demand high-level
technical skills have developed well-documented fault-finding methods. It is not surprising, though, that
all of these methods have similar principles: they follow the scientific method.
The scientific method is not specific to any technology. It is an investigative process that uses logic
to test theories or hypotheses through observation and methodical experimentation. In one form or
another, it has been in use for as long as people have derived knowledge from the world around them.
The scientific method uses logical and systematic steps (procedures) to analyse available
information, such as symptoms, in the hope of finding what is useful and relevant whilst
discarding what is not. This procedure will enable you to draw conclusions and, ideally, arrive at the
source of the problem. Generally, the method is repeated (cyclic) until the source of the problem has
been identified.

5
Source: State of NSW, Department of Education and Training, as at
http://lrrpublic.cli.det.nsw.edu.au/lrrSecure/Sites/Web/6196/ICAT4221A/index.htm, as on 10th December,
2016.

The principles of the scientific method are summarised in the following steps:
1. Gather Information
2. State the Problem
3. Form a hypothesis
4. Test the hypothesis
5. Draw conclusions
6. Repeat when necessary
The next part of this reading will introduce fault-finding techniques, based on the scientific method.

Cyclic fault-finding
Cyclic fault-finding is the preferred method for problem determination used in the IT industry. The
myriad hardware and software tools available for fault finding will help you gather useful
information, but generally the tool won’t fix the problem for you. You will need to make your own
decisions about the best course of action.
Generally, companies develop their own cyclic methods, or choose to adhere to someone else’s
method, e.g. Cisco’s troubleshooting guidelines (see the Cisco website www.cisco.com and search for
"troubleshooting"). The most important part of troubleshooting any problem is to divide the tasks of
problem resolution into a systematic process of elimination. In general, cyclic fault-finding involves
taking a series of five to eight steps, and then repeating these steps until the problem is
solved.
It is important to note that cyclic methods rely on technicians formulating a hypothesis (probable
cause, step 3) and then testing the hypothesis (steps 4 and 5). If the desired outcome is not achieved,
the process is repeated with a new hypothesis. Take a look at the steps introduced below:

[Flowchart]
1. Define Fault
2. Gather Details
3. Probable Cause
4. Create Action Plan
5. Implement Action Plan
6. Observe Result
7. Solved? (No: repeat the cycle; Yes: continue)
8. Document

Figure 1: Steps in Cyclic Fault Finding

Strictly speaking, the above process only requires 7 steps for troubleshooting, but best-practice is to
update/create appropriate documentation – good quality documentation will only aid fault-finding in
the future!
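
The cycle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the list of probable causes and the `pretend_fix` test are hypothetical, and a real technician would replace them with actual checks and repairs.

```python
def cyclic_fault_find(fault, candidate_causes, try_fix):
    """Work through probable causes until one fix solves the fault.

    fault            -- short description of the defined fault
    candidate_causes -- probable causes, most likely first
    try_fix          -- function(cause) -> True if the fault is gone afterwards
    Returns a list of log lines, which serves as the documentation step.
    """
    log = [f"Fault defined: {fault}"]
    for attempt, cause in enumerate(candidate_causes, start=1):
        log.append(f"Cycle {attempt}: probable cause '{cause}'")
        solved = try_fix(cause)   # action plan implemented, result observed
        log.append(f"Cycle {attempt}: solved = {solved}")
        if solved:
            log.append(f"Resolved by addressing '{cause}'")
            return log
    log.append("Unresolved: gather more details or escalate")
    return log


# Hypothetical example: the real cause is a loose patch lead.
def pretend_fix(cause):
    return cause == "loose patch lead"


for line in cyclic_fault_find(
    "user cannot reach the file server",
    ["wrong IP settings", "loose patch lead", "failed network card"],
    pretend_fix,
):
    print(line)
```

Each pass through the loop is one cycle: a new hypothesis is formed, acted on and checked, and the log records every attempt so that the final documentation step is not forgotten.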

Fault-finding decision trees


Quite often, IT support companies will develop fault analysis trees or fault-finding decision trees.
These are aids for support people to use as guidelines when troubleshooting.
With time, you will be able to create your own decision trees to help the fault-finding process. The
following is an example of a decision tree aimed at helping someone troubleshoot a network access
fault (e.g. a user cannot log in or access her e-mail).

[Flowchart containing the following boxes]
Start: User cannot access network (e-mail)
Question: DHCP in use? (Yes / No)
Actions: Ping e-mail server by hostname; Ping by hostname in same subnet
Question: Success? (Yes / No)
Action: Ping e-mail server by IP address
Question: Success? (Yes / No)
Actions: Reconfigure WINS/DNS; Reboot and log on again; Release/Renew DHCP
Question: E-mail accessible? (Yes / No)
Question: First time? (Yes: Reconfigure IP settings and reboot; No: Investigate possible
hardware/cabling/driver faults etc.)
End points: Problem solved; Move on to another decision tree...

Figure 2: Sample decision tree to help someone troubleshoot a network access fault

The steps in the tree above can be explained as follows:
The tree begins by defining the problem: The user cannot access e-mail. Then a question is asked
(diamond shape). Only two answers are possible: Yes or No. Depending on the response a process
takes place (rectangular shape), which leads on to another question. No matter how complicated, all
decision trees work in the same manner:
1. State Problems (Begin)
2. Ask Questions (Diamond Shape)
3. Analyse Response (Yes or No)
4. Take Action (Rectangular shape)
5. Ask more questions
6. Analyse response
7. Take action
8. And so on until problem is solved or a different course is deemed necessary
It is not unusual for software and hardware manufacturers to include such charts with their products,
as additional support information. The general idea is that people may, with the help of these trees,
perform first-level support, potentially cutting down on the number of support calls made to
companies. Effectively, vendor supplied charts become a form of fault finding tool.
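
To show how such a tree can be followed mechanically, here is a minimal Python sketch. The small tree in it is a hypothetical example that borrows a few box labels from the sample figure; it is not a transcription of that figure, and the names `TREE` and `walk` are assumptions made for the illustration.

```python
# A question node is a dict with "question", "yes" and "no" branches;
# an action (leaf) is a plain string. This small tree is hypothetical.
TREE = {
    "question": "Does a ping to the e-mail server by hostname succeed?",
    "yes": "Reboot and log on again",
    "no": {
        "question": "Does a ping to the e-mail server by IP address succeed?",
        "yes": "Reconfigure WINS/DNS",
        "no": "Investigate possible hardware, cabling or driver faults",
    },
}


def walk(tree, answer):
    """Follow Yes/No answers from the root until an action is reached.

    answer -- function(question) -> True for Yes, False for No
    """
    node = tree
    while isinstance(node, dict):
        node = node["yes"] if answer(node["question"]) else node["no"]
    return node


# Example: the hostname ping fails but the IP address ping works,
# which points towards a name resolution problem.
answers = {
    "Does a ping to the e-mail server by hostname succeed?": False,
    "Does a ping to the e-mail server by IP address succeed?": True,
}
print(walk(TREE, answers.get))
```

Keeping the tree as plain data means support staff can extend it with new questions without touching the code that walks it.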

About system problems

Critical problems

Not all problems have the same impact on a business. Some problems have low impact, which means
that the problem does not have the potential to disrupt business operations. Some problems have a
very high impact, which means that the problem has the potential to stop business operations, cause
revenue loss and possibly damage the business's reputation.
For example, if a part-time worker cannot use her computer because the keyboard won’t work, this
would not be regarded as critical to the business. It is certainly disruptive to this worker’s routine,
but it probably won’t stop business. If the CEO’s (Chief Executive Officer) keyboard stops functioning,
the problem is more significant given the stature of this person in the business. The criticality of the
latter problem is higher than that of the former.
If instead of a problem with a keyboard or even a person’s desktop, we find ourselves facing a problem
with the file server, the criticality stakes are raised again. This is the type of problem that is likely
to bring a business to its knees. Business cannot continue when critical components, software or
hardware, go down or become unusable. When critical or indispensable components are struck by
problems, this is regarded as the most critical type of fault.

Classifying problems

Faults need to be classified according to their criticality, that is, the impact a specific problem
may have on business operations. The questions that will need to be answered may include:
• How critical is this problem?
• What is the impact on the overall operations of the business?
• Should the contingency and disaster recovery plan be enacted?
• Does the business have the expertise to deal with the problem and provide a satisfactory
solution?
Problems that are regarded as non-critical (low criticality) do not represent a threat to the daily
operations of a business. Operations will continue with some level of disruption. This disruption may
affect a standalone system, a series of systems or an entire network. An example of a problem
regarded as non-critical would be an Internet server going down due to a hardware failure. This is
certainly non-routine, but assuming that the business does not use the Internet for its core business
operations, those operations may continue, although without Internet access.
Problems that are regarded as critical have the potential to seriously impair the function of a
business. These types of faults will generally require IT personnel to
enact a contingency and disaster recovery plan. Businesses that are not prepared for these types of
faults and that have not formulated a sound contingency and disaster recovery strategy will suffer
serious consequences, including a total halt of business operations and loss of revenue. An example
of this type of fault would be an inaccessible database server holding inventory, ordering and sales
data, without which business cannot proceed.
Quite often, IT support managers and supervisors are responsible for assessing the criticality of faults.
Many companies have different scales for representing criticality. The following is a suggestion of how
this could be implemented:

Table 1: Sample scale for representing criticality of faults

Criticality Level or Risk | Definition | Disaster Recovery
1 | High potential impact to large number of users. It involves network/system down time. | Enact Disaster Recovery Plan.
2 | High potential impact to large number of users or business critical service. May result in some down time. | May require enacting Disaster Recovery Plan.
3 | Medium potential impact to smaller number of users or business service. Resolution may require some down time. | Disaster Recovery Plan enactment not warranted. Remedial action required.
4 | Lower potential service or user impact. Change may require some down time. | Disaster Recovery Plan enactment not warranted. Remedial action required.
5 | No user or service impact. No down time. | Disaster Recovery Plan enactment not warranted. Remedial action optional.
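
A help desk tool could hold the sample scale above as a simple lookup and use it to order outstanding faults. The Python sketch below is one possible way to do this; the function names and the example faults are hypothetical, and the guidance strings are shortened from the table.

```python
# Recovery guidance for each criticality level, following the sample scale.
RECOVERY_GUIDANCE = {
    1: "Enact Disaster Recovery Plan",
    2: "May require enacting Disaster Recovery Plan",
    3: "Disaster Recovery Plan not warranted; remedial action required",
    4: "Disaster Recovery Plan not warranted; remedial action required",
    5: "Disaster Recovery Plan not warranted; remedial action optional",
}


def recovery_guidance(level):
    """Return the guidance for a criticality level from 1 (highest) to 5."""
    if level not in RECOVERY_GUIDANCE:
        raise ValueError(f"criticality level must be 1 to 5, got {level!r}")
    return RECOVERY_GUIDANCE[level]


def sort_by_criticality(faults):
    """Order (description, level) pairs so the most critical come first."""
    return sorted(faults, key=lambda fault: fault[1])


# Hypothetical open faults and the levels a supervisor assigned to them.
open_faults = [
    ("Keyboard not working on one desktop", 4),
    ("Database server inaccessible", 1),
    ("Printer tray sensor warning", 5),
]
for description, level in sort_by_criticality(open_faults):
    print(level, description, "->", recovery_guidance(level))
```

The level itself still has to be assigned by a person who understands the business; the code only makes the agreed scale consistent to apply.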

Hardware faults

Apart from classifying faults as critical or non-critical, you will need to use other classifications
to aid the troubleshooting process. One typical classification is whether the
source of the fault is a hardware device or component, or whether it lies in
software (system or application).
Hardware faults are often reasonably easy to troubleshoot, as the symptoms of the fault are fairly
obvious. For example, if the power supply unit of a computer fails, the computer will not power up.
Sometimes, though, hardware faults can be difficult to find if the fault and symptoms appear only
intermittently, that is, the fault is not present all the time. For example, some hardware components
only develop faults under certain conditions, such as when the temperature of the device reaches a
certain threshold.
Hardware faults can sometimes be rectified fairly quickly by replacing the failed component. Usually,
technicians will have common Field-Replaceable Units (FRUs) available. FRUs are simply common
components that can be replaced in the field with reasonable ease. Examples of FRUs may include:
• Hard Disk Drives
• Floppy Disk Drives
• Optical Drives (CD, CDR, DVD etc.)
• Memory (RAM)
• Sound Cards
• Video Cards
• Keyboard & Mouse
• Network Interface Cards
• Network Patch Leads

Software faults

As you might have guessed, software faults are those faults that are caused by a software component.
The software component may be part of the system software or may be application software.
Software faults can sometimes be tricky to troubleshoot. Even when the source of the problem is
found to be software, it is not always clear which software component is actually causing the
fault.

System Software Faults: these are faults that are caused by system software. Generally speaking, the
operating system is regarded as system software. However, some application software might also
install system components that it needs to run, which could become (and quite frequently are) the
source of faults. System software faults can be caused by:
• Corruption of software components
• Incorrect system configuration
• Documented and undocumented bugs
• Compatibility issues (hardware and software)
System software faults can have system-wide implications, which might hinder the operations of the
whole system.
Application Software Faults: these types of faults are rooted in application software components.
Generally, these types of faults only affect the application software in question, and the rest of the
system operates normally. As with system software faults, the source of these faults can be tracked
down to one or more of the following:
• Corruption of software components
• Incorrect application configuration
• Documented and undocumented bugs
• Compatibility issues (hardware and software)

Security-related faults
These are faults that weaken the security of a system. They might have their source in hardware,
software, configuration or design.
More often than not, security-related faults are the consequence of:
• Other faults (for instance, a hardware fault with a firewall device might expose systems that
would normally be protected by the firewall device)
• Improper configuration
• Unpatched software bugs
• System design flaws
• Undiscovered security holes/backdoors
Generally, the occurrence of any of the above issues will result in security being compromised,
possibly exposing confidential and private information. Rectifying this type of fault generally requires
engaging personnel with expertise in the area.
Security faults are often called vulnerabilities, and an attack that takes advantage of one is called an
‘exploit’. The fault does not in itself represent a real threat unless someone malicious discovers and
chooses to exploit it. It is imperative that proactive action be taken to minimise the effect of
security compromises.

Boot time faults

Boot time faults are faults that occur during the start-up sequence of a computer system. Boot time
faults are critical in that they can halt the boot sequence, and with it the system,
rendering it unusable.
Boot time faults can have their source in software, usually due to improper configuration, missing
system files or incompatibilities (often after new software has been deployed), or in hardware, usually
due to failure of the boot device (typically a hard disk drive) or of another major component such as
RAM or the video card. Failed hardware peripherals might have an impact on booting up, but will not
necessarily halt the system or make it unbootable.

Documenting system problems and symptoms

Your success as a fault-finding technician greatly depends on your ability to accurately document a
fault and its symptoms. You might be surprised to know that a great part of the answer lies within the
documentation of the problem and its symptoms. Hence, it is vital that documentation is not
overlooked and that appropriate standards are observed.

Documentation standards
Many businesses choose to set up their own standards for documenting IT systems, help-desk
procedures, and change management. Businesses tend to be very diverse in size, complexity and IT
infrastructure; hence, their documentation requirements may vary. One thing is constant, though:
good quality documentation is good practice and a must-have.
There are some standards in Australia that govern how technical documentation should be done. In
Australia, a body called Standards Australia is responsible for developing and promulgating standards
for all industries, including Information Technology. Standards Australia has published standards
which address systems and technical documentation, and provide a high quality guideline for IT
professionals to follow and implement. Many of these standards are identical to (or adaptations of)
ISO (International Organization for Standardization) standards. The standards may be accessed online
at www.standards.com.au, which requires a paid subscription. Many libraries have copies of these
standards or can offer you access to copies of the standards online from within the library.
Worldwide, ITIL (Information Technology Infrastructure Library) has emerged as a de facto
standard for many areas of IT. ITIL is a set of best-practice standards for Information Technology
service management. It too provides standards (and actual templates) for maintaining
documentation. ITIL was developed under the Office of Government Commerce (OGC) in the United
Kingdom; ownership has since passed to AXELOS. For more information about ITIL you may want to
visit www.itil.org.uk

Using software to document problems


In many instances, companies choose to maintain their technical and fault-finding information
documentation electronically, using off-the-shelf or custom built software. There are many software
packages that offer the ability to store and manage documentation. Sometimes, these systems are
referred to as knowledge management/change management systems.

Not every organisation, particularly a small one, needs to deploy a knowledge management system.
Sometimes a small application that can be used as a general-purpose help desk
repository and job tracking system is enough. The reasons for keeping documentation such as
inventory, configuration and change management records, job tracking and so on are varied; however,
the most important reason is to provide reference information for the analysis of complex system
problems. Good quality data can prove invaluable when troubleshooting problems down the track. Think
of knowledge bases: they can supply you with invaluable hints, workarounds, fixes and ideas. Even if
exactly the same problem has not been encountered before, you are likely to find ideas
that you hadn’t thought of.
There are many software packages that can do the job for a small to medium sized enterprise (SME).
One such example is Track-It! from Intuit. Track-It! can help SMEs to not only maintain technical
documentation, but assist in the running of a small help desk operation. Note that Track-It! is a
Windows based solution.
Go to the Track-It! website (http://www.itsolutions.intuit.com/) and select "Tour Track-It!" for an
excellent short overview of this software. You may also download an evaluation version of Track-It!
Note that free registration is required.
An example of an enterprise solution that allows medium to large enterprises to problem-manage and
maintain associated documentation is BMC Remedy IT Service Management. Remedy is used by many
businesses around the globe for managing all aspects of support, change management, problem
management and documentation. Remedy is supported on most operating systems platforms. For
more information on Remedy you can go to the website (www.bmc.com/remedy).
Systems and technical documentation is frequently associated with knowledge management. There
are many products that can take care of this aspect of IT management. One such product is Microsoft’s
SharePoint. SharePoint is a Microsoft proprietary solution that specialises in the management of
documentation and knowledge bases, providing a centralised repository for documentation.
SharePoint is set up on a Windows 2000 or Windows 2003 server platform. For more information on
SharePoint you can go to the website www.microsoft.com.

Using troubleshooting tools to gather information

The fault-finding process is a continual process of gathering data (or feedback) and making decisions
based on this data. Hence, it is critical to understand how to use tools to gather data, or to access
data which may already be available from logs, data trails, databases etc.

Using utilities

There is a range of utilities which produce output or feedback and are commonly used for
troubleshooting purposes. Many fault-finding tools, particularly command-line utilities, do not
generally produce exhaustive reports for analysis. Instead they produce a small message to confirm
the success (or otherwise) of an action.
For example, the ping command simply sends a series of requests to a network destination and reports
whether the requests were successful or not. It may report on data such as response time (how long
it took our request to be answered), but the tool will not produce an exhaustive report. Please analyse
the following sample of using the Ping tool from the command line.

Figure 4: Screen shot of use of Ping tool
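
A script can collect the same yes/no feedback that ping gives a technician at the command line. The Python sketch below is a minimal example, assuming the standard count options (`-n` on Windows, `-c` on Linux and macOS) and treating an exit code of 0 as success, which is a simplification. The function names are hypothetical.

```python
import platform
import subprocess


def build_ping_command(host, count=4, system=None):
    """Build the ping command line for the current operating system."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def host_responds(host, count=4, runner=subprocess.run):
    """Run ping and report success based on its exit code (a simplification).

    runner is injectable so the function can be tested without a network.
    """
    result = runner(build_ping_command(host, count), capture_output=True, text=True)
    return result.returncode == 0
```

Calling `host_responds("localhost")` on a machine with the ping utility installed returns True or False; the detailed response times remain in the captured output for anyone who needs them.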

The following is another example of a utility that performs a task and generates incidental information,
which although not comprehensive, is certainly very useful. The utility presented here is the ‘Format’
command.

Figure 5: Screen shot of use of Format command

Debug/auditing features
Many computer operating systems and applications feature a ‘debugging’ facility. If you are not
familiar with debugging, it is a concept borrowed from programmers, aimed at getting a system to
produce as much information as possible, in a stepped-through way, with the aim of capturing
information which might help get rid of ‘bugs’ or design flaws. The concept of debugging has filtered
through many areas of IT, and today many systems feature debugging as a standard troubleshooting
tool which can be turned on as needed.

Additionally, many network hardware components, such as high-end switches and routers, feature
debugging. They are able to debug because many managed network devices run an actual operating
system. For example, Cisco routers and switches run an operating system called IOS, which allows
full debugging support.
Debugging is usually a fairly intense process which generates lots of information (sometimes more
than one can handle!) and puts additional processing demands on the system; therefore, debugging
is usually only enabled when troubleshooting is required. Information produced by debugging
can either be output to the screen (which means you have to be in front of the actual screen) or, more
commonly, redirected to a database system, where data can be stored for later viewing and analysis.
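
The idea of switching debugging on only when it is needed can be seen in Python's standard `logging` module. The sketch below is an illustration, not a description of how any particular operating system or network device implements debugging; the logger name and messages are hypothetical, and a log file stands in for the storage system mentioned above.

```python
import logging


def configure_logging(debug=False, logfile=None):
    """Return a logger that records DEBUG detail only when asked to."""
    logger = logging.getLogger("faultfinding")   # hypothetical logger name
    logger.handlers.clear()                      # avoid duplicate handlers
    logger.setLevel(logging.DEBUG if debug else logging.WARNING)
    # Send output to a file for later analysis, or to the screen by default.
    handler = logging.FileHandler(logfile) if logfile else logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


log = configure_logging(debug=True)
log.debug("Link negotiation started")   # recorded only while debug is on
log.warning("Link dropped")             # recorded in both modes
```

With `debug=False` only the warning is recorded, which keeps the volume of information and the processing overhead low during normal operation.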
Another feature that can be useful for gathering system information is auditing.
Auditing is usually not turned on by default because, as with debugging, the amount of information
generated is great; hence, auditing is usually only enabled for specific data gathering
requirements, e.g. when a technician is trying to get to the bottom of an elusive fault.
Windows 2000/XP/2003 allow you to enable auditing by turning on a policy, usually through a Group
Policy Object (GPO). A GPO gives you control of certain aspects of the system which you may audit. The
following image is a sample of a GPO in Windows XP.

Figure 6: Screen shot of GPO in Windows XP

The resulting event audit information can be viewed in an event log, accessible via the Windows Event
Viewer. See example below.

Figure 7: Screen shot of Windows Event Viewer

Using diagnostic and troubleshooting tools to gather data


Diagnostic and troubleshooting tools are usually particularly effective at generating large amounts of
information. Most diagnostic tools are capable of generating reports that can be used as a way of
documenting the configuration of a system.
Third-party tools can also be very useful, especially if they are not tied to one operating system. The
Belarc Advisor is a tool that can run on just about any version of Windows (generally, you will
need a separate tool to audit across platforms, e.g. Windows, Linux, Mac). Belarc Advisor will profile a
computer system and output its report as an HTML file, which can be viewed in any web browser and
possibly be uploaded to a central location. This program is available free for personal use at
www.belarc.com.
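
To give a feel for what a profiling tool does, the Python sketch below gathers a handful of system facts with the standard `platform` module and renders them as a small HTML report. It is a toy illustration only and does not reflect how Belarc Advisor or any other product works; the function names and the choice of fields are assumptions.

```python
import html
import platform


def collect_profile():
    """Gather a few basic facts about the machine running this script."""
    return {
        "Hostname": platform.node(),
        "Operating system": platform.system(),
        "OS release": platform.release(),
        "Architecture": platform.machine(),
        "Python version": platform.python_version(),
    }


def profile_to_html(profile):
    """Render the profile as a minimal HTML table, escaping all values."""
    rows = "\n".join(
        f"<tr><th>{html.escape(name)}</th><td>{html.escape(value)}</td></tr>"
        for name, value in profile.items()
    )
    return f"<html><body><table>\n{rows}\n</table></body></html>"


def write_report(path):
    """Save the report so it can be opened in any web browser."""
    with open(path, "w", encoding="utf-8") as report:
        report.write(profile_to_html(collect_profile()))
```

Calling `write_report("system_profile.html")` produces a file that can be kept with the rest of the system documentation. Because the `platform` module is available wherever Python runs, the same script works across operating systems.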

Activity 4

This activity will require you to think about a series of faults and classify them according to their
nature. Your task is to complete the chart below.

Table: Fault classification

Fault Description | Hardware/Software | Criticality (1 to 5)
Inaccessible boot record in active partition | Software, unless actual hardware drive failure |
Blue Screen of Death (BSOD) when launching ABC application in Windows 200X | |
Modem will not dial up ISP | |
Browser returns a ‘Host not found’ error when attempting to browse Internet | |
User cannot access e-mail | |
User unable to log in to network | |
Intermittent link problem returned by Network Interface Card software | |
Computer automatically reboots for no apparent reason when room temperature is above 35 °C | |
Software application reports ‘Out of Virtual Memory’ when attempting to run in Windows | |
Server will not power on. No POST. No lights visible or spinning fans, or any indication of power through system | |

Analyse the data to determine if there is a problem and the nature of the
problem6
Troubleshooting computers can be a little frustrating and a little tricky. With so many parts and
so much software installed, any number of things can go wrong. But when (not if) something happens,
this is the best opportunity for you to learn, provided of course that you have a few basics under your
belt. Nothing beats experience. The more you do it, the better you become, and the more your confidence
grows. And the best part: you will save yourself a lot of money.

There are many things that can go wrong with a computer. Here, I try to cover the basics to get you
going in the right direction.

Well, let's start with an important tip: when troubleshooting computers, always start with the simple
stuff. By that I mean there's a tendency to assume that when something happens it's always due to a
major problem, when it could be nothing more than a loose cable or something else minor. I have been
guilty of this myself. Check the easy things first!

Now the real challenge is deciding whether a symptom is hardware or software related. A lot of times
this comes through trial and error. Don't be afraid of misdiagnosing a problem. It's going to happen.
Just keep at it.

Issues During POST:

When you power on your system, the power supply sends a signal to the CPU, which receives
instructions to go to the BIOS to start the boot process. Part of this process is the POST (Power-On
Self-Test). Problems arising at this stage are almost always hardware. During the POST, devices are
found and checked for errors. If everything is fine, the motherboard speaker will usually sound a single,
short beep and the system will move on to loading the operating system. If something is wrong, you will
hear a different pattern of beeps or see an error message on the screen. BIOS manufacturers have
different beep codes, so you will have to know which BIOS your system is using. Phoenix and AMI are the
two primary makers. Award BIOS was bought out by Phoenix in 1998. You can find the type of BIOS you have
by turning on your computer (assuming of course it comes on) and looking at the top left of the screen,
by opening the case and looking at the BIOS chip, or by consulting the motherboard manufacturer or the
company that built your computer.

6
Source: Learning About Computers, as at
http://www.learning-about-computers.com/tutorials/troubleshooting.shtml, as on 9th December, 2016.

Whichever BIOS you have, if the beep code indicates a memory or video card problem, the usual
solution is to check that the part is fully seated in its slot or to replace it. If you are using
built-in video, then it could be the motherboard. If it's a CPU beep code, your processor might be
overheating. Some BIOS setups are set to shut the computer down if the processor is too hot. A
malfunctioning processor fan could be the culprit. Turn off the computer and remove the case door. Turn
the computer back on and see whether the fan is working or running slowly. If it's the fan, replace it.
If not, remove the processor and see if there's any physical damage to it. Keep in mind that you will
not always see physical damage on a bad CPU.

If you don't hear a beep at all, more than likely it's a failing power supply or motherboard.

Devices Not Listed in BIOS:

Immediately after the POST is performed, information about your computer is listed on the screen,
including your drives. If you don't see a drive listed, go back and make sure it is installed properly
and that its cables are firmly connected.

No Operating System Found or Similar Message:

After the POST and listed information the BIOS checks the boot device for the master boot record
(MBR), which tells where the operating system (OS) is. A drive set to boot with no operating system
will produce an error, so make sure your system is set to boot from the right device.

Go into the CMOS setup and look under the BOOT menu to see if the proper boot order is listed. (Again,
depending on the BIOS, there are various ways to enter CMOS setup. The key is listed at the bottom of
the screen soon after you turn on the computer. Most of the time it's DEL, F1, or F2.) In many cases
the DVD drive is first on the list, followed by the hard drive(s). That's OK. If the DVD drive is empty,
the BIOS skips it and starts looking at the hard drives. If there is a non-bootable DVD in the drive,
remove it. Your boot drive should be the first option, or the second if the DVD drive is first. Once
the boot drive is found, the OS begins to load.

Another cause for this message is that the master boot record itself can become corrupted.

Computer is Slow:

A computer that runs at a snail's pace is quite annoying, especially when you have a lot of work to get
done. Fortunately, many of the common causes are easily fixable.

A slow-running computer is often due to viruses and spyware, which are discussed below. Another
cause can be programs running in the background. Many programs are designed by default to run when
Windows starts. You can look in the tray at the bottom right of
the screen to see the installed software that's running. You can usually stop a program from starting
with Windows by right-clicking its icon in the tray, selecting its properties or
options and choosing not to have it begin at start-up. Alternatively, open the program itself and go
to its options/properties menu.

Another way to prevent programs from running at startup is to run msconfig.

To open msconfig in XP, click Start, then Run, and type msconfig. In Vista, click Start and type
msconfig in the "Start Search" text box right above the task bar (the program icon should appear in the
white area above the text box), then either double-click the icon or press Enter. Go to the Startup tab.
There you will see the same programs that are in your tray. You have the choice of disabling them all
(not wise, as certain software, such as anti-virus, needs to run when Windows starts) or individually
selecting the ones you don't want to start by unchecking the box next to them. After making your
selection(s), click Apply. Your choices will go into effect the next time you start your computer.

Another common reason for a slow computer is not having enough RAM. Installing more can often
help the problem.

Viruses/Spyware:
Viruses and spyware can not only slow down your computer, they can render it unusable.
Furthermore, certain types of viruses and spyware can transmit your personal information to the
attackers. You should always have antivirus running on your system. If you are looking for a good free
option, I recommend Avast.

Limited Hard Drive Space:

After a long period of time, most of our hard drives contain data we no longer need, or data left over
by software that was not completely uninstalled, eventually leading to a messy drive. Given the size of
modern hard drives, this is rarely an issue anymore. In any event, if you are a clean freak like me, you
may want to periodically clean house. Windows' built-in Disk Clean-up tool is a good way to get rid of
unwanted files, although there's plenty of other software available too. And of course, you can always
add an additional hard drive if you need more storage space.

To open Disk Clean-up in XP or Vista click start -> Programs -> Accessories -> System Tools -> Disk
Clean-up and follow the instructions.
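
Before cleaning house, it can help to measure how much space is actually left. The short Python sketch below uses the standard `shutil.disk_usage` function; the 10 percent threshold and the function names are assumptions chosen for the example, not a recommendation from the source.

```python
import shutil


def percent_free(total_bytes, free_bytes):
    """Free space as a percentage of the drive's total size."""
    return round(100 * free_bytes / total_bytes, 1)


def is_low_on_space(path, threshold_percent=10):
    """True when the drive holding `path` has less free space than the threshold."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (in bytes)
    return percent_free(usage.total, usage.free) < threshold_percent


# Check the drive that holds the current folder.
print(is_low_on_space("."))
```

On Windows the path could be a drive such as `"C:\\"`; on other systems any folder on the drive of interest will do.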

Fragmented Hard Drive:

When a hard drive is brand new and you begin installing software or saving data, Windows tries to
keep each individual file in one contiguous piece, which lets it be read very quickly. But after a while you
start deleting things. Each time something is deleted, it leaves "gaps" in your drive. Then when
another program is installed or data saved, individual files are broken up and placed in these gaps all
over the drive. This is what is known as a fragmented hard drive. When opening a file or program, the
drive has to visit several separate locations to collect the parts of the file and put them back together,
increasing read time. This is why it can seem to take forever for a file to open.
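
The way gaps lead to fragmentation can be shown with a toy model. The Python sketch below treats the drive as a short list of blocks and always writes into the first free blocks it finds. This is a deliberate simplification for illustration (real file systems use much smarter allocation), and every name in it is hypothetical.

```python
def write_file(disk, name, size):
    """Store `size` blocks of file `name` in the first free blocks found."""
    free = [i for i, block in enumerate(disk) if block is None]
    if len(free) < size:
        raise ValueError("not enough free space")
    for i in free[:size]:
        disk[i] = name

def delete_file(disk, name):
    """Free every block that belongs to `name`, leaving a gap behind."""
    for i, block in enumerate(disk):
        if block == name:
            disk[i] = None

def count_fragments(disk, name):
    """Count the separate contiguous runs of blocks that belong to `name`."""
    fragments = 0
    previous = None
    for block in disk:
        if block == name and previous != name:
            fragments += 1
        previous = block
    return fragments

disk = [None] * 12           # a tiny 12-block "drive"
write_file(disk, "A", 3)
write_file(disk, "B", 3)
write_file(disk, "C", 3)
delete_file(disk, "B")       # leaves a 3-block gap in the middle
write_file(disk, "D", 5)     # fills the gap, then spills over past file C
print(disk)
print("Pieces of file D:", count_fragments(disk, "D"))
```

File D ends up in two pieces, one in the gap left by B and one after C, while A and C each stay in a single piece. Defragmenting amounts to rearranging the blocks so that every file is back in one run.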

Defragmenting a hard drive is easy with Windows Disk Defragmenter. It scans your drive for split up
files and reassembles them. To open it in XP or Vista, click Start -> Programs -> Accessories -> System Tools -> Disk
Defragmenter. Before using Disk Defragmenter, I would suggest running Disk Clean-up first to
eliminate unwanted data. As with Disk Clean-up, there are many other 3rd party defragmenting
programs available.

Non-Working Devices/Device Not Recognized:

If a device has stopped functioning or isn't recognized by Windows, remember to first check the simple
things. Make sure cables and power are plugged in. With an internal component, turn off and unplug
the machine. Remove the case door and make sure cables are firmly connected to the device and that
add-on cards are seated in their slots. If all is OK, there may be a device driver issue. Device drivers
are little pieces of software that allow hardware to work. Reinstall the device driver or download the
latest version, either from the website of the device's manufacturer or from the company where you bought your
computer. If you still have no success, try uninstalling and reinstalling the device.

If the above doesn't produce any results, it is probably the device itself.

Problems After Installing New Software or Device Driver:

Of course, you should first uninstall the software or driver, or use System Restore to return your system
to a previous working state. To open System Restore in XP or Vista, click Start -> Programs ->
Accessories -> System Tools -> System Restore.

There are times when new programs might freeze up your system. In this case try to see if you can
boot to Safe Mode and then perform a restore. Safe Mode only loads the very basic devices and drivers
needed for your system. To get to Safe Mode restart your system. When it begins to boot, continuously
press the F8 key. A menu should appear that looks similar to the one on the left.

Choose Safe Mode and press enter. After Windows loads you should get the screen on the right with
a black desktop. Start System Restore as described above.

No Power:

The main culprit is usually the power supply unit (PSU). Make sure the power cord is securely plugged
into the supply and the wall outlet. If so, you can buy a tester to see whether your PSU is putting out
enough voltage.

Another cause could be a malfunctioning device. Turn off the computer and disconnect all devices.
Reinstall each device one by one, turning on the computer after each device. Should your system not
come on after installing a particular component, replace it.

If your system doesn't come on after reinstalling every device, you may have a motherboard or CPU
problem.
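
The one-device-at-a-time procedure above can be sketched as a simple loop. In this Python sketch, `powers_on_with` is a hypothetical callback that stands in for you physically connecting the listed devices and pressing the power button. The sketch also assumes one reading of the last paragraph: that a machine which will not power on even with every device disconnected points to the base system (PSU, motherboard or CPU).

```python
BASE_SYSTEM = "base system (PSU, motherboard or CPU)"

def find_faulty_device(devices, powers_on_with):
    """Reconnect devices one at a time and report the first that stops power-on.

    `powers_on_with` is a hypothetical stand-in for the manual check: it takes
    the list of connected devices and returns True if the computer powers on.
    """
    if not powers_on_with([]):
        return BASE_SYSTEM           # fails even with everything disconnected
    connected = []
    for device in devices:
        connected.append(device)
        if not powers_on_with(connected):
            return device            # this is the component to replace
    return None                      # every device reconnected without trouble

# Simulated bench test: pretend the DVD drive is the faulty part.
result = find_faulty_device(
    ["graphics card", "hard drive", "DVD drive"],
    lambda connected: "DVD drive" not in connected,
)
print(result)
```

Adding devices one at a time, rather than all at once, is what lets you blame a single component: the last thing you changed before the failure appeared is the prime suspect.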

Spontaneous Reboots:

A computer that reboots often (while you're in Windows or another operating system) is another
indication of a bad power supply. See the first couple of sentences under No Power above.

Time Keeps Changing:

If you constantly have to set the time/date clock, that's the main symptom of a bad CMOS battery.
Replace it. But just like any other battery it has to be the same size. Look at the number on your battery
and buy one with the same number.

Determine specific symptoms of hardware, operating system and printer
problems

Simple solutions to common problems

Most of the time, problems can be fixed using simple troubleshooting techniques, like closing and
reopening the program. It's important to try these simple solutions before resorting to more extreme
measures. If the problem still isn't fixed, you can try other troubleshooting techniques.

Problem: Power button will not start computer

• Solution 1: If your computer does not start, begin by checking the power cord to confirm that
it is plugged securely into the back of the computer case and the power outlet.
• Solution 2: If it is plugged into an outlet, make sure it is a working outlet. To check your outlet,
you can plug in another electrical device, such as a lamp.
• Solution 3: If the computer is plugged in to a surge protector, verify that it is turned on. You
may have to reset the surge protector by turning it off and then back on. You can also plug a
lamp or other device into the surge protector to verify that it's working correctly.
• Solution 4: If you are using a laptop, the battery may not be charged. Plug the AC adapter
into the wall, then try to turn on the laptop. If it still doesn't start up, you may need to wait a
few minutes and try again.

Problem: An application is running slowly

• Solution 1: Close and reopen the application.
• Solution 2: Update the application. To do this, click the Help menu and look for an option to
check for Updates. If you don't find this option, another idea is to run an online search for
application updates.

Problem: An application is frozen

Sometimes an application may become stuck, or frozen. When this happens, you won't be able to
close the window or click any buttons within the application.

• Solution 1: Force quit the application. On a PC, you can press (and hold) Ctrl+Alt+Delete (the
Control, Alt, and Delete keys) on your keyboard to open the Task Manager. On a Mac, press
and hold Command+Option+Esc. You can then select the unresponsive application and click
End task (or Force Quit on a Mac) to close it.
• Solution 2: Restart the computer. If you are unable to force quit an application, restarting
your computer will close all open apps.

Problem: All programs on the computer run slowly

• Solution 1: Run a virus scanner. You may have malware running in the background that is
slowing things down.
• Solution 2: Your computer may be running out of hard drive space. Try deleting any files or
programs you don't need.
• Solution 3: If you're using a PC, you can run Disk Defragmenter.

Problem: The computer is frozen

Sometimes your computer may become completely unresponsive, or frozen. When this happens, you
won't be able to click anywhere on the screen, open or close applications, or access shut-down
options.

• Solution 1 (Windows only): Restart Windows Explorer. To do this, press and hold
Ctrl+Alt+Delete on your keyboard to open the Task Manager. Next, locate and select
Windows Explorer from the Processes tab and click Restart. You may need to click More
Details at the bottom of the window to see the Processes tab.
• Solution 2 (Mac only): Restart Finder. To do this, press and hold Command+Option+Esc on
your keyboard to open the Force Quit Applications dialog box. Next, locate and select Finder,
then click Relaunch.
• Solution 3: Press and hold the Power button. The Power button is usually located on the front
or side of the computer, typically indicated by the power symbol. Press and hold the Power
button for 5 to 10 seconds to force the computer to shut down.
• Solution 4: If the computer still won't shut down, you can unplug the power cable from the
electrical outlet. If you're using a laptop, you may be able to remove the battery to force the
computer to turn off. Note: This solution should be your last resort after trying the other
suggestions above.

Problem: The mouse or keyboard has stopped working

• Solution 1: If you're using a wired mouse or keyboard, make sure it's correctly plugged into
the computer.
• Solution 2: If you're using a wireless mouse or keyboard, make sure it's turned on and that its
batteries are charged.

Problem: The sound isn't working

• Solution 1: Check the volume level. Click the audio button in the top-right or bottom-right
corner of the screen to make sure the sound is turned on and that the volume is up.
• Solution 2: Check the audio player controls. Many audio and video players will have their own
separate audio controls. Make sure the sound is turned on and that the volume is turned up
in the player.
• Solution 3: Check the cables. Make sure external speakers are plugged in, turned on, and
connected to the correct audio port or a USB port. If your computer has color-coded ports,
the audio output port will usually be green.
• Solution 4: Connect headphones to the computer to find out if you can hear sound through
the headphones.

Problem: The screen is blank

• Solution 1: The computer may be in Sleep mode. Click the mouse or press any key on the
keyboard to wake it.
• Solution 2: Make sure the monitor is plugged in and turned on.
• Solution 3: Make sure the computer is plugged in and turned on.
• Solution 4: If you're using a desktop, make sure the monitor cable is properly connected to
the computer tower and the monitor.

Known-good parts: the troubleshooting silver bullet⁷

Before we get into specific symptoms and fixes, there's one silver bullet that's guaranteed to get you
past most of the support person’s troubleshooting script and right to what you want: the known-good
part. That is, a power adapter, stick of memory, hard drive, or other component that has been plugged
into another system and is known to be working properly.

Let’s say you’ve got a laptop that won’t power on. If you switch its power adapter for one that is known
to work and the laptop still won’t start (or if you use its power adapter with a laptop that will power on and charge), you can say with
a fair degree of certainty that the power adapter is not the problem. This method does require you to
have working spare parts available for testing. But if you tell a phone tech that you’ve tested a
particular problem with known good parts, you’ll automatically skip through a lot of the script, and
quite possibly to the end of the conversation.

PC troubleshooting

PCs are always getting simpler and more streamlined, but there are still a lot of different parts to most
of them, which means that there is a lot more that can go wrong with them. We’ll go through potential
problems component by component, matching symptoms to issues and telling you the best way to
inform your friend on the other end of the phone. Pay attention here, because many of these
symptoms and procedures are also going to be useful when troubleshooting Macs, phones, and
tablets.

7 Source: Ars Technica, as at http://arstechnica.com/information-technology/2012/06/the-technologists-guide-to-troubleshooting-your-own-hardware/2/, as on 9th December, 2016.

Some computer manufacturers may ship (or make available for download) special diagnostic tools
intended to detect problems with particular components. It's not always necessary to use these tools
to diagnose problems, but getting support will often be easier if you have the error messages and
codes generated by their tools. Having these error codes handy is the ultimate phone support
shortcut, and if you open with them, you’ll almost always skip straight to the part where they set up
the dispatch for you.

Power problems

Symptoms: Computer won't power on, battery won't charge.

If the computer simply isn't responding to any attempts to turn it on, you may be having power
problems. Remember that there's a difference between not powering on and not booting—a
computer with power problems won't light up or make any noises when the power button is pressed.
If lights and fans are coming on but the operating system won't load, you may have a memory, hard
drive, or even motherboard error instead.

As a first step, unplug the computer from power and remove any batteries, then press and hold the
power button for 10 to 15 seconds. This will completely power cycle the computer, draining out any
electricity that may be left lingering in its circuits (some desktop motherboards have a light on the
motherboard that will stay on for a while after the computer has been unplugged—once this light
goes out, you've discharged all of the power). If you plug the computer back in and still have no luck,
it's time to start troubleshooting the different stages of the journey between the wall and the
computer:

Start with the surge protector. Does the computer behave the same way if connected directly to the
wall, or to another outlet that is known to be working normally?

Look at the power brick if you've got a laptop. Most power bricks have two cords: one that runs from
the outlet to the brick, and one that runs from the brick to the computer. If either of these cords can
be detached from the brick, try again with a known good cord if you have one. If you've got a desktop,
you'll usually just have one cable to check, the one that goes from the outlet to the back of the
computer. If your laptop’s cables and adapters are working normally, you’ve probably got a
motherboard problem, and it’s time to call support.

If you’ve got a desktop, your problem could be either with the motherboard or with the system’s
internal power supply. Again, a known-good power supply will tell you exactly which is the problem,
but be sure to check for things like the aforementioned motherboard status light—if it lights up when
the computer is plugged in, it may point to a motherboard issue rather than a power issue.

If your computer will turn on but your battery won’t charge, you’ve almost certainly got a bad battery.
As always, try a known good battery in the computer (and, if you can, try the suspect battery in a
laptop that is known to charge) and make sure it’s not an issue with the contacts in the computer.

If you do have a bad battery, it likely isn’t covered under warranty unless it failed prematurely. If the
battery is less than a year old, you may be able to get a replacement. But if the battery is over a year
old, any loss of capacity or breakage will generally be seen as “normal wear and tear” and you’ll have
to buy a new one. Most laptop manufacturers will insist you buy a first-party battery to avoid voiding
the warranty on the rest of the computer.

Memory

Symptoms: Blue screens or crashing applications, computer powers on but will not boot, other erratic
behaviors.

MS-DOS chic: MemTest86+ is the mother of all memory testers.

Memory errors can be hard to diagnose since they're often intermittent, but they present most often
as general system instability: individual applications or the entire operating system may crash, and the
system may sometimes refuse to boot. You may even experience graphics corruption, since the
integrated graphics processors used by many computers today use the same memory as the rest of
the system.

For more serious memory errors, the computer may power on and beep or make the power light flash
a certain number of times without attempting to boot from the hard drive. Consult your computer's
manual to see if these correspond with any known error codes.

Your first step in troubleshooting this problem is going to be a good memory diagnostic tool. Some
computers will have a memory test tool built into the BIOS; the vendor’s support center will typically
ask for error codes generated by those tools when replacing memory.

But there are some good general-purpose alternatives if your computer shipped without one of these
tools. Windows 7 comes with its own memory diagnostic, which can be run from within Windows 7 or
from the Windows 7 install media.

My personal favorite memory test tool is Memtest86+. To run Memtest, you’ll need to download its
disk image, burn it to a CD, and then boot your computer off the CD. The tool will automatically start
testing your memory and will keep making additional passes until you shut the computer off.
Generally, if the tool hasn't found an error after two or three passes, it's not going to.

Once you’ve verified that you’re dealing with memory errors (and assuming that your computer has
multiple memory modules installed, as almost all of them do these days), take each module out and
test it individually. This can help you isolate the issue to one of the RAM modules. Once you’ve got a
module that you know is good, be sure to test it in all of the slots as well—this will reaffirm that the
problem is with the memory stick and not with one or more of the RAM slots on the motherboard.
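
The module-by-module, slot-by-slot procedure can be written down as a small routine. In this Python sketch, `passes_test(module, slot)` is a hypothetical callback standing in for a full memory test run (for example with Memtest86+) with only that module installed in that slot. The names and the return format are assumptions made for illustration.

```python
def isolate_memory_fault(modules, slots, passes_test):
    """Return (bad_modules, bad_slots) found by testing one module at a time.

    `passes_test(module, slot)` is a hypothetical stand-in for running a
    memory test with only `module` installed in `slot`.
    """
    first_slot = slots[0]
    # Step 1: test every module on its own in the same slot.
    bad_modules = [m for m in modules if not passes_test(m, first_slot)]
    good_modules = [m for m in modules if m not in bad_modules]
    if not good_modules:
        # Every module failed in the first slot, so the slot itself is suspect.
        return bad_modules, None
    # Step 2: move one known-good module through every slot.
    known_good = good_modules[0]
    bad_slots = [s for s in slots if not passes_test(known_good, s)]
    return bad_modules, bad_slots

# Simulated results: stick B is faulty and so is slot 2.
bad_modules, bad_slots = isolate_memory_fault(
    ["stick A", "stick B"],
    ["slot 1", "slot 2"],
    lambda module, slot: module != "stick B" and slot != "slot 2",
)
print(bad_modules, bad_slots)
```

A result of `None` for the slots means no known-good module was found, so the slots could not be checked; in that case try the modules again in a different slot before drawing conclusions.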

Hard drive

Symptoms: Slow or inconsistent performance, errors when attempting to access files, computer unable
to boot, louder-than-usual drive clicking or activity noises (for mechanical HDDs only).

Scheduling a disk check with Chkdsk, Windows' built-in disk checking tool.

Losing the hard drive in a computer is one of the most devastating failures you can experience, since the
data is often the most valuable part of the computer. Even if all of the other components fail, the drive
can still be pulled and the data transferred. But data recovery services for failed hard drives can cost
thousands of dollars, and they aren’t foolproof. If you don’t have a good backup system in place (and
you should: there are plenty of products that do it), you should be checking your drive for errors
regularly—detecting a failure early is the best way to prevent data loss.

As with RAM, some manufacturers (particularly business-class systems from the likes of Dell or HP)
include their own diagnostic tools with their computers, either in the BIOS or on a disc—if they do,
they’ll prefer data gathered with those tools to data gathered by others. Even so, you can generally
convince them that you’re having problems if you tell them you’re experiencing one or more of the
symptoms listed above along with confirmed bad sectors found by a tool like Microsoft’s Chkdsk.
Whatever tool you run, it’s vitally important that you back up any data from a suspect drive before
you run any of these scans, as they are quite intensive and may actually exacerbate problems in the
process of detecting them.

Chkdsk is normally run in one of two ways, depending on whether you can get your computer to boot
or not. If your computer can boot, you can initiate the scan from within Windows. In a Windows
Explorer window, go to Computer and right-click the drive you’d like to scan. Click Properties in the
menu that pops up. In the properties box for the drive, under the Tools tab, click “Check now” under
the Error-checking section, check “Scan for and attempt recovery of bad sectors,” and click Start. The
computer will then offer to schedule a disk check for the next time you start the computer; accept the
prompt and restart the system.

Finding your Chkdsk results in the Event Viewer takes a little digging.

When the disk check is done, it will display the results of the scan, but they’ll likely flash by so quickly
that you'll miss them. To see this log file after the fact, open up the Windows Event Viewer (type
“Event Viewer” into the Start menu’s search field and it should come up), expand the “Windows Logs”
drop down, and select “Application.” The Chkdsk log should be near the top of this list (the Source
column should say “Wininit”), and if you scroll down under the “General” tab you should see the
results of your test, as shown in the screenshot below. If you see anything more than “0 KB in bad
sectors,” you should replace that drive—it’s not long for this world.

Running Chkdsk from the command line is actually a bit easier than using the GUI.

The other way to run Chkdsk is from the Windows install media. Boot to the media and before you do
anything, press Shift+F10 on your keyboard to bring up a command prompt window. Type chkdsk c:
/r (assuming the drive you want to check is drive C) and wait for the results. Again, anything more
than 0 KB in bad sectors on the disk means that it’s time for a replacement.
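
If you save the Chkdsk results to a text file, the "KB in bad sectors" figure can be picked out automatically. The Python sketch below is only an illustration: the sample text is made up to resemble a Chkdsk summary rather than copied from a real log, and the exact wording and number formatting may differ between Windows versions and languages.

```python
import re

def bad_sector_kb(chkdsk_output):
    """Return the KB reported in bad sectors, or None if the line is missing."""
    match = re.search(r"(\d[\d,]*)\s+KB in bad sectors", chkdsk_output)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))   # tolerate "4,096" style numbers

# Illustrative sample only, not a real log.
sample = """
 488384511 KB total disk space.
         0 KB in bad sectors.
"""

kb = bad_sector_kb(sample)
if kb is None:
    print("No bad-sector line found; check the log by hand.")
elif kb > 0:
    print(f"{kb} KB in bad sectors: back up and replace the drive.")
else:
    print("No bad sectors reported.")
```

Treating a missing line as "unknown" rather than "zero" is deliberate: a log without the summary line should be read by hand rather than taken as a clean result.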

Display problems

Symptoms: Computer won't display output on monitor, display is corrupted or garbled.

Shattered LCD screens like this one typically aren't covered under warranty, since screens are most
often broken by being either dropped or stepped on.

Your first task in this case is to determine whether the problem is caused by the monitor or the
graphics processor; the tech on the phone is going to ask you if you've tried connecting the computer
to another monitor to test this. If output displays normally on the second monitor, then the problem is with
the computer's screen; if output is still corrupted or garbled, you've got a problem with the GPU (or
perhaps with the memory, as noted above—make sure you run thorough memory diagnostics before
blaming the GPU). Telling the phone tech that you've tested with another monitor will usually get
them to skip their script and go right to fixing your problem.
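
The reasoning in the paragraph above boils down to two yes/no questions, which can be captured in a short Python sketch. The function and the result labels are hypothetical and only restate the decision order described here; they are not a vendor's diagnostic procedure.

```python
def diagnose_display(external_monitor_ok, memory_test_passed):
    """Suggest the most likely culprit for corrupted or garbled display output.

    external_monitor_ok: True if output looks normal on a second monitor.
    memory_test_passed:  True if thorough memory diagnostics found no errors.
    """
    if external_monitor_ok:
        return "built-in screen"          # the computer's output itself is fine
    if not memory_test_passed:
        return "memory"                   # rule out RAM before blaming the GPU
    return "graphics processor (GPU)"

print(diagnose_display(external_monitor_ok=False, memory_test_passed=True))
```

Checking the second monitor first keeps the test cheap: it needs no tools, and a normal picture there immediately clears the graphics processor and the memory.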

In cases where the computer appears to be powering on normally but there's no output on the display,
it's not uncommon for the monitor's backlight to have gone out. If you put the display under a bright
light, you may be able to make out images and text. If you've got a desktop computer or an external
monitor that won't power on, double-check the power cable and power adapter the same way you
would for a laptop.

Motherboard

Symptoms: Most of the above.

Since the motherboard connects to all of the major components of the computer, its problems can
manifest in all of the ways we've already talked about. A bad RAM slot can make it look like you have
bad memory; a non-booting computer can look like a power issue; a bad integrated graphics processor
might look like a monitor issue. One of the best ways to diagnose motherboard problems, if your
computer doesn't come with built-in motherboard diagnostics, is to test all of the other components
first. When you have eliminated the impossible, whatever remains, however improbable, must be the
truth.

Mac troubleshooting

Diagnostics, OS 9 style: it's the Apple Hardware Test!

Troubleshooting a Mac is similar to troubleshooting a PC in most important ways, since the two have
shared basically identical internals for the last six years now (and even before the Intel transition, the
fundamentals were mostly the same). As such, the symptoms they present will be the same, and some
troubleshooting steps (particularly for display and power issues) will be identical.

Macs do use their own troubleshooting tools, though, and there are a few Mac-specific
troubleshooting steps that the AppleCare representatives are going to ask you to perform when you
call in.

Apple’s main diagnostic tool for Macs is the Apple Hardware Test, which ships with all new Macs and
has remained nearly unchanged (right down to the OS 9-style windows and buttons) despite the Mac
platform’s numerous software and hardware changes over the last decade. The tool can be accessed
by holding down the D key at boot, but if you've wiped or replaced your Mac's hard drive since buying
it, you'll need to go through a couple of extra steps: on older, pre-Lion Macs, you can find the Apple
Hardware Test on one of the restore DVDs that came with the computer. Newer Macs can access the
tool via the Internet thanks to the Lion Internet Recovery feature, again by holding down the D key as
the computer boots.

Performing the Extended version of the test is recommended, since it’s the one that is more likely to
find errors. It will do fairly robust testing on your memory, motherboard ("logic board" in Apple
parlance), and other components—take note of any error codes or messages you see while the test is
running. Other diagnostic tools (like TechTool Pro) have been developed for the Mac, and Memtest86+
will also work as a memory tester, but for the purposes of getting support, Apple tends to want error
codes and messages generated by its own tools.

You can use Disk Utility's verify and repair disk options to check for some hard drive failures—it relies
on the SMART self-reporting information from the drive to detect problems. My personal experience
with SMART has been a bit mixed—it has detected drive failures in time to save the data, but it has
also failed to catch errors before a drive or two became unreadable garbage. As is the case on PCs, a
good backup strategy is your best protection against drive failure.

Disk Utility can detect some errors, but a lack of surface scan options limits its usefulness.

There are also a couple of Mac-specific things you can do to get rid of minor, intermittent problems,
especially those related to trouble booting or powering on. Most Apple techs will ask you if you’ve
already performed these steps, so even if they don’t fix anything, you might as well try them.

The first such trick is to reset your PRAM, which contains settings for virtual memory, the computer’s
start-up disk, and a few other settings. To reset the PRAM, turn your computer off, press and hold the
Command, Option, P and R keys, and power on the computer. Once you hear the start-up chime the
second time, let go of the keys, and see if your problems persist.

The second Mac-specific fix is to reset the System Management Controller (SMC), which can fix fan
speed and power issues (including both the system’s power lights and the ability to power the system
on); a complete list is available on Apple’s support page: http://support.apple.com/kb/HT3964. On
desktops and older MacBooks with removable batteries, resetting the SMC is just a fancy name for
power cycling a PC: unplug the system, remove the battery, and hold down the power button for a
few seconds to drain any residual power from the system. On newer MacBooks with built-in batteries,
you must turn the computer off, then press (but not hold) the Shift, Control, Option, and power
buttons at the same time.

Smartphone and tablet troubleshooting

There's much less to say about tablets and smartphones than about some of the other hardware we've
looked at—the more appliance-like a device gets, the fewer things you can try to get it working before
you have to cave and call support or take it into your carrier’s store. If your device isn’t charging, the
steps for troubleshooting a faulty power adapter are the same as they would be for a PC or a Mac—
try a known good adapter with your device, or the suspect adapter with a known-good device—but
otherwise there simply aren’t many diagnostics to run.

If your power adapter seems OK but you can’t turn the device on, there are a couple of things to try:
if your phone or tablet has a removable battery, remove it and hold the power button down for a few
seconds and then replace the battery to cycle the power.

On iOS devices that don’t feature removable batteries, you can try to perform a hard reset by pressing
both the power and the home button simultaneously—if there’s nothing wrong with the phone, it
should turn on within a few seconds, at which point you can release the buttons. These
troubleshooting steps can also help you out if your device has frozen or become unresponsive.

If your phone or tablet powers on fine but is otherwise acting strangely, the first thing to do is to install
any updates to your phone or tablet’s operating system, if they’re available—these can fix everything
from battery life problems to performance issues to security holes or missing functionality. Most
phones and tablets these days can receive over-the-air updates without plugging into a computer, and
you can usually check for new ones from somewhere in the device settings.

If your problem is software-related, a factory reset should fix it right up.

If that doesn’t fix the issue (or if there are no updates available), your last resort on most of these is
going to be a complete reset, equivalent to wiping and reinstalling the operating system on PCs and
Macs. Before you do this, you'll want to be sure that any data the user needs has been backed up
somewhere—the presence of iCloud is helpful in this case, if the user has signed up for it; otherwise,
you'll have to deal with whatever sync software or service that particular device uses. Resetting the
software is normally done from within the Settings menu on most phones and tablets (Settings >
General > Reset on iOS devices, Settings > Backup and Reset on most Android devices). If the problem
persists after a reset, it's time to call support.

Activity 5

This activity will require you to become familiar with auditing features in a Windows based
operating system.

• Using the Internet, go to www.microsoft.com
• Using a combination of online documentation and search facilities within the site, find out
how to enable auditing of printing devices in the Windows XP operating system (note that
you don’t need access to a Windows XP system – only if you wish to test your findings).
• Write down the steps that you need to take in order to audit printing device usage and
statistics and consequently view the actual auditing data.

What are the steps you need to take in order to enable the auditing of a printing device?

Formulate a solution and make provision for rollback
Troubleshooting and Repair in an Apple Environment⁸

Try Quick Fixes

A quick fix is not necessarily the most likely solution to the issue, but because it is easy to perform and
involves little time or expense, it is worth trying. There is nothing more frustrating than spending hours
isolating an issue only to find out later that a quick fix solves it.

A quick fix is defined here as a repair action that:

• Can be performed quickly
• Involves little or no risk of harm to the system
• Has little or no cost
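
The three criteria above can be treated as a simple checklist. The Python sketch below is a hypothetical illustration: the `RepairAction` fields, the five-minute limit and the zero-cost limit are assumptions chosen for the example, since the text only says "quickly" and "little or no".

```python
from dataclasses import dataclass

@dataclass
class RepairAction:
    name: str
    minutes: int      # estimated time to perform
    risk: str         # "none", "low" or "high" risk of harm to the system
    cost: float       # cost of parts or service

def is_quick_fix(action, max_minutes=5, max_cost=0.0):
    """Apply the three quick-fix criteria; the thresholds are assumed values."""
    return (
        action.minutes <= max_minutes          # can be performed quickly
        and action.risk in ("none", "low")     # little or no risk of harm
        and action.cost <= max_cost            # little or no cost
    )

candidates = [
    RepairAction("Turn the printer off and on", minutes=1, risk="none", cost=0.0),
    RepairAction("Reseat the printer cable", minutes=2, risk="low", cost=0.0),
    RepairAction("Replace the logic board", minutes=90, risk="high", cost=400.0),
]
quick_fixes = [action.name for action in candidates if is_quick_fix(action)]
print(quick_fixes)
```

Anything that fails the checklist is not ruled out as a repair; it simply belongs to the later, more methodical stage of isolating the issue.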

An experienced, efficient troubleshooter will try one or more quick fixes before taking on the more
time-consuming tasks involved with isolating the issue.

NOTE

“Quick fix” does not imply a temporary, substandard, or sloppy repair.

Let's take another look at the printing issue we just considered. Possible quick fixes in this situation
include:

• Turn the printer off and back on again, then try to print.
• Restart the system and try again.
• Disconnect and securely reconnect the printer cable (being careful to follow safety
precautions).
• Take the paper out of the paper cassette and reinsert it to be sure that it is inserted properly.

These quick fixes take only a moment, involve very little risk of harm to the system, and involve no
expense. If the issue is not solved after you try these quick fixes, you can confidently move to the more
time-consuming task of logically and methodically isolating the issue.

Here are some more examples of quick fixes:

• Disconnect and reconnect power cables, printer cables, monitor cables, and so forth. (Make
sure that the Mac and its devices are turned off when you do this, except when dealing with
USB and FireWire devices, which are hot-pluggable.)
• Rebuild the desktop by holding down Command-Option as the Finder loads in pre–Mac OS X.
• Shut down the Mac completely (in as proper a manner as possible), wait at least 10 seconds,
and then turn it back on. Better yet, turn off the Mac and all of its connected peripherals, wait
a bit, then turn everything back on.
• Adjust physical user controls (such as brightness and contrast knobs on a display) as well as
software controls (such as the output volume setting in Sound preferences).

8 Source: Peachpit, as at http://www.peachpit.com/articles/article.aspx?p=420908&seqNum=2, as on 9th December, 2016.

This is only a partial list. The situation and your experience will determine which quick fixes make
sense for troubleshooting the issue you are working on.

A good source of quick fixes is the troubleshooting symptom charts in the Troubleshooting lesson of
the product's service manual. You should consider any steps that fit the criteria for quick fixes.

As you gain experience, you will develop your own collection of quick fixes. The following section lists
other quick fixes that might be appropriate for systems running Mac OS X. Some of them can affect
data on the customer's system, so you will have to consult with the customer to determine whether
he or she has a current backup and weigh the advantages of the quick fix against the possible
inconvenience or time required.

Quick Fixes for Mac OS X

Mac OS X has a lot of settings and toggles that you can work with to help quickly determine and isolate
issues. There are so many, in fact, that you might not have discovered them all. To help you keep your
tests as low-impact as possible, we've broken the Mac OS X quick-fix tests into three categories, which
you should try in order.

Level 1: Innocuous/No Impact

 Restart or shut down.
 Run System Profiler.

TIP

If you have access to Service Source, you can check to see whether any of the Top Support
Questions look similar to the situation you're seeing. (You'll find these from the Service Source
main page, by opening the product menu and then choosing the product's support page.)

 Start up in Safe Mode (Mac OS X 10.2 and later), which loads only the minimum necessary
files and performs an elaborate directory check of the hard disk (which is why it can take a
long time to start up). After you hear the start-up sound, press Shift and hold it until the
progress indicator displays “Safe Boot.” To end the safe boot and get back to typical operation,
just restart as normal.
 Suppress Auto-Login (in Login or Accounts preferences) if you suspect that the issue lies within
the default user's system configuration, and then restart and log in as a different user.
 Suppress Login items by holding down the Shift key as soon as the Finder appears in Mac OS
X.
 Start up from a known-good disc such as Install Mac OS, Restoration, or MacTest Pro CD.

 Click Repair Disk Permissions in the First Aid tab of Disk Utility.
 Start up in single-user mode by pressing Command-S during start-up. The Mac should (after
displaying a bunch of technical text) show a UNIX command line prompt (#). Enter any UNIX
commands you wish, or type exit and press Return.
 Start up in verbose mode by pressing Command-V during start-up. This forces the Mac to
display text that explains what UNIX is doing before the customary graphical user interface
appears. You'll need to understand at least basic UNIX for this to be of any use.
 Start up in another Mac OS by selecting a different volume in Start-up Disk preferences.
 Relaunch Finder by Option-clicking the Finder icon in the Dock, and then choosing Relaunch
from the menu that appears.
 Disconnect all external devices.
 Turn off Screen Saver and Energy Saver (if troubleshooting an installation issue) in System
Preferences.
 Verify with other users (if troubleshooting a network issue).
 Connect to another device or volume (if troubleshooting a network issue).
 Connect to PPP test server (if troubleshooting a modem issue).

Level 2: Moderate Impact

 Adjust user settings.
 If troubleshooting a network issue, check the settings in the Firewall tab in Sharing
preferences.
 Choose Network Port Configurations from the Show pop-up menu in Network preferences.
Make sure necessary ports (such as Ethernet or AirPort) are activated.
 Check the Start-up Disk selection (if troubleshooting a start-up issue) in System Preferences.
 Force quit a troublesome application by choosing Force Quit from the Apple menu.
 Log in as a (new) test user in Accounts preferences. Since most user settings are tied to the
user account, you can create a new account with which you can test a more standardized user
environment, presumably with no conflicting or corrupted system resources.
 Launch the Disk Utility, select the start-up disk, click the First Aid tab, then click Repair Disk
Permissions. If any repairs were necessary, repeat the process.
 Move, rename, or delete potentially problematic preferences files. The applications that use
the preference files will automatically re-create clean copies as necessary.
 Update the printer driver (if troubleshooting a printing issue).
 Update the firmware for peripherals (such as AirPort Base Station or an internal optical drive)
if possible.
 Move a troublesome device from one port to another to determine whether the port or the
peripheral is at fault.
 Use known-good peripherals (for example, monitor, disk drive, printer).

Level 3: Invasive/High Impact

 Reinstall the suspect application.
 Reset the PRAM by holding down Command-Option-P-R at start-up until you hear the start-
up sound twice.
 Reset the PMU or SMU chip (see the service manual), but always reset the main logic board
before resetting the PMU or SMU a second time.

 Perform a recommended (default) installation of the Mac OS.
 Perform an OS Archive and Install.
 Perform an OS Erase and Install.
 Replace current RAM with known-good RAM.
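The "try them in order" rule for the three levels can be sketched in a few lines of Python. This is an illustration only: the `QuickFix` record, the `may_affect_data` flag, and the sample entries are assumptions made for this example and are not part of any Apple tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuickFix:
    """A hypothetical record for one quick fix (names are illustrative)."""
    description: str
    level: int                 # 1 = no impact, 2 = moderate, 3 = invasive
    may_affect_data: bool = False


def order_by_impact(fixes):
    """Return the fixes sorted from lowest to highest impact level.

    sorted() is stable, so fixes within one level keep their listed order.
    """
    return sorted(fixes, key=lambda fix: fix.level)


def fixes_needing_backup_check(fixes):
    """Return the fixes to discuss with the customer before trying them."""
    return [fix for fix in fixes if fix.may_affect_data]


candidate_fixes = [
    QuickFix("Perform an OS Erase and Install", 3, may_affect_data=True),
    QuickFix("Restart or shut down", 1),
    QuickFix("Update the printer driver", 2),
    QuickFix("Disconnect all external devices", 1),
]

for fix in order_by_impact(candidate_fixes):
    print(f"Level {fix.level}: {fix.description}")
```

Sorting by level keeps the low-impact tests first, and the separate backup check mirrors the advice to consult the customer before trying anything that can affect data.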

Use Appropriate Diagnostics

Diagnostic tools are software packages you can use to check the performance of a system, determine
whether the system components are functioning correctly, and pin down the cause of a system issue.

See Lesson 2, “Software Tools,” for an overview of Apple's primary diagnostic programs, as well as
others from Apple and third parties.

TIP

The split-half search is a very helpful technique to use in the diagnostic process. See “Split-Half
Search,” later in this lesson, for full details.

Use Additional Resources to Research the Issue

At the start of this lesson you learned that, along with good troubleshooting technique, product
knowledge and experience are the basis for efficient, professional troubleshooting. If you have
completed the steps described so far and still can't determine the source of the issue, it is time to
research additional resources.

In situations in which you may not have in-depth experience or product knowledge, you can use such
references as Service Source and the Knowledge Base.

These resources are collections of the best information assembled by Apple. There is a good chance
that solutions to your issues are documented in one or both of these references.

Escalate the Issue

If you still cannot troubleshoot an issue despite your best efforts, you may need to escalate your
problem to Apple. How you do this depends on where you are located and the practices and policies
of your business or agency.

Repair or Replace the Faulty Item

After determining the source of a service issue, it is time to repair or replace the faulty item. There are
several steps that you must take before starting to replace software or hardware.

 Make a full backup of the customer's hard disk before updating, reinstalling, or otherwise
modifying the software on a system. This ensures that you can restore the system to its
original state if you need to do so.

 Use known-good software when modifying a system. Avoid introducing new issues while
trying to solve the original one.
 Look for the latest versions of software when updating or reinstalling software. This is
particularly important for System folder components such as extensions, control panels, and
peripheral drivers. At the same time, you should be careful not to add new software
components that can adversely affect applications and other software that the customer has
placed on the system.
 Follow all safety guidelines for working on computer systems. This includes powering down
systems before connecting or disconnecting peripherals.
 Observe all appropriate electrostatic discharge (ESD) precautions before working on
hardware. (You will learn about ESD in Lesson 4, “Safe Working Procedures and General
Maintenance,” and Lesson 10, “Cathode-Ray Tubes.”)

Verify the Repair by Testing the Product Thoroughly

Make sure that the computer is functioning correctly before you return it to the customer. Sometimes
you may fix one issue only to find another, or you may have fixed the right module but left a cable
unplugged when reassembling the product. To ensure a positive customer experience, thoroughly test
every product you repair before telling the customer it is fixed. You need to make sure that:

 The entire issue has been resolved.
 No new issues have been introduced during troubleshooting and repair.
 All elements of the system are compatible.

TIP

When verifying repairs for central processing units (CPUs), use MacTest Pro, Apple Service Diagnostic
(ASD), or Apple Hardware Test (AHT) to test the entire system, even if you repaired only one part of
the system. If possible, run looping tests for several hours, to catch any intermittent issues.

TIP

When verifying repairs for peripherals, if there is a diagnostic available for the product, use it! For
example, many printers have built-in self-tests; read the manual to determine how to initiate this
useful feature.

Verifying the Repair Exercise

Answer the following questions. If needed, refer to the previous section of this lesson as well as Service
Source, MacTest Pro, and the General Troubleshooting Flowchart.

1. You replaced the main logic board of a Mac that was having intermittent issues. The situation
seems to be fixed. How should you verify that the intermittent issues no longer occur?
2. A customer's iMac was not printing to a third-party color inkjet printer. You have reinstalled
the printer driver and generated a black-and-white test page on this printer. Do you need to
verify further? If so, what should you do?

Verifying the Repair Exercise Answer Key

1. Conduct looping tests of the system over an extended period using MacTest Pro or a similar
diagnostic.
2. Print a color test page. You have checked only part of the system's performance so far.

Inform the User of What You Have Done

Once you have returned the computer to normal operation (or escalated the issue), inform the user
of the work that you completed.

Keep in mind the following suggestions for giving your customer the best possible information:

 When verifying a repair with MacTest Pro, save the test log. You can show the log to your
customer as evidence that you have tested the system thoroughly and that it passed the tests.
Test logs are not available for ASD and AHT.
 Print out other diagnostics that you have completed and show them to the customer.
 Explain any steps the customer can take to avoid having situations recur. For example:
o If the customer has shut off the system incorrectly, explain the hazards of not shutting
down properly.
o If the customer's system was made unusable by a virus, teach the customer how to
avoid viruses in the future.
o If the customer has lost data, describe some ways to back up information.

The basic idea is to give customers information that improves their computing experience. Taking the
time to teach them how to avoid future issues adds value to the service you provide.

Complete Administrative Tasks

Each Apple Authorized Service Provider (AASP) has different administrative procedures for
documenting service and handling parts. How you complete the administrative tasks for servicing an
Apple product depends on where you are located and the internal policies of your business or agency.

Split-Half Search

A split-half search is a technique for systematically isolating the source of an issue. You start by
eliminating roughly half of the items you are checking, then trying to re-create the issue. You continue
halving your search group until you find the source of the issue. A split-half search requires applying
your knowledge of the product, its common issues, and the symptoms as you check one possible cause
after another, in a logical order.
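The halving idea can be sketched as follows. This is a minimal illustration under two stated assumptions: exactly one item is at fault, and you have some way to re-create the issue with only a chosen subset of items active. The function name `split_half_search` and the `issue_occurs` callback are hypothetical, not part of any diagnostic tool.

```python
from typing import Callable, List, Sequence


def split_half_search(items: Sequence[str],
                      issue_occurs: Callable[[List[str]], bool]) -> str:
    """Return the single item that causes the issue.

    Assumptions for this sketch: exactly one item is at fault, and
    issue_occurs(subset) reports whether the issue can be re-created
    when only the items in `subset` are active.
    """
    if not items:
        raise ValueError("nothing to search")
    candidates = list(items)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if issue_occurs(first_half):
            candidates = first_half            # culprit is in the first half
        else:
            candidates = candidates[middle:]   # otherwise it is in the rest
    return candidates[0]


# Hypothetical example: eight extensions, one of which is faulty.
extensions = [f"Extension {n}" for n in range(1, 9)]
faulty = "Extension 6"
tests_run = []


def issue_occurs(active: List[str]) -> bool:
    tests_run.append(list(active))
    return faulty in active


print(split_half_search(extensions, issue_occurs))  # prints "Extension 6"
print(len(tests_run))                               # prints 3
```

With eight suspects, three tests are enough, whereas checking them one by one could take up to seven. In practice the two assumptions do not always hold (for example, two items may conflict only with each other), which is why product knowledge still matters.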

This part of the troubleshooting process can be the most difficult and the most time-consuming. That's
why a logical and methodical plan is so important. We've found that the following order has been
effective:

1. User errors. Check for user errors in the course of gathering information, duplicating the issue,
and trying quick fixes. But keep in mind the possibility of incorrectly set switches or
preferences, incompatible equipment, and incorrect assumptions on the user's part; take
nothing for granted.
2. Software-related issues. Unusable or incompatible software, viruses, extension conflicts,
duplicate System folders, and other software issues can cause symptoms that look like
hardware issues. Replacing hardware won't solve these problems, and it wastes time and
money, so always check for software issues before replacing any hardware. MacTest Pro
system software tests can detect and repair many software issues of this type. Remember
that you must check both the applications and the Mac OS.
3. Software viruses. As you most likely know, a virus is a program that replicates itself and often
modifies other programs. When a virus gets into system software, the computer may not start
up, the system may stop responding, or the software may work incorrectly. (It may be helpful
to define this for a customer who really isn't sure why a virus could be such a big issue.)
Although Macintosh computers are less likely to become infected with viruses than computers
running other operating systems, it is still possible to get a virus on a Mac. Email attachments
and other files downloaded from the Internet are common sources of virus infection.

To check for a virus, ask customers these questions:

o Did you recently receive software from another user or a common source and add the
software to your system?
o Did you experience the issue before you obtained the software?
o Did you share this file with others? Are they having similar issues?

You can find up-to-date virus information on the Internet at a variety of locations. Third-party
virus utilities such as McAfee Virex (www.networkassociates.com) can check systems and
remove viruses from them. Virex is available as a free download to all .Mac subscribers.

If you do detect a virus, make sure you find the original source file and delete it. Then reinstall
all affected system and application software, and dispose of any unusable data files.

4. Hardware issues. When you are convinced that user error, a virus, or other software has not
caused the issue, hardware is what is left. Here are some tips:
o Simplify the issue. Remove external devices and internal cards (except the video card,
if needed for display) and test the main unit alone. If the issue remains, you have
isolated it to the computer itself. If the issue disappears, reinstall the cards and
peripherals one by one, until the symptoms reappear. When they do, you have found
the culprit—or at least a clue.
o If the system can be tested with AHT, do so. This can often save you considerable time
when checking for hardware issues.
o Find the “problem space.” Try to identify the functional area—sometimes called a
problem space—that the issue affects. For instance, the general functional areas for
a typical Mac could be considered software, logic and control, memory, video,
input/output (I/O), and power.

If you can narrow down the issue to, for example, the video area, you can narrow your search to the
parts that relate to video: the monitor, cables and connectors, video random-access memory (VRAM),
video card (if present), and logic board.

 Inspect components, especially mechanical parts and fuses. You may be able to see the cause
(a blown fuse or a visibly defective chip), smell it (a burning smell is often a tip-off), or hear it
(grinding noises are seldom a good sign).
 Work from largest to smallest components of the system. Avoid the urge to go straight to the
heart of the issue. Instead, methodically work your way down, testing along the way.
Gradually narrow the focus of your search. For example, if you suspect there is an issue with
a component of the System folder, you would want to first check the complete Mac OS by
starting the system from a known-good CD with the same version of the Mac OS. Only when
you know that the rest of the computer system is working correctly would you want to start
investigating the components of the Mac OS.
 When testing, test only one thing at a time. It is ultimately more efficient to methodically test
one thing at a time and move on to the next than it is to try two or three things at once. This
means reinstalling the original part if a replacement part does not correct the issue. Always
work from the original system at each step.

If a test you perform does not reveal the source of the issue, restore the system to the condition it
was in and move on to the next test. Make a note of what you just tried and what the result was.

If you have backups of the files you update or replace, you can return the system to its former state
after each test.

NOTE

The symptom charts in the service manuals provide step-by-step instructions on what to do for specific
issues.

Good Technique

1. Disconnect external USB and FireWire devices.
2. Check result.
3. Reset PRAM.
4. Check result.
5. Start up from another System folder (such as a bootable CD).
6. Check result.

Bad Technique

1. Disconnect external USB and FireWire devices, reset PRAM, and start up from another System
folder.
2. Check result.
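The difference between the two techniques can be sketched in code. The example below is a toy model, not a real diagnostic: the `state` dictionary, the step definitions, and the function name `run_steps_individually` are all hypothetical and exist only to show the "change one thing, check, restore" loop.

```python
def run_steps_individually(steps, issue_present):
    """Apply one change at a time and check the result after each.

    steps: list of (name, apply, undo) tuples; all three are supplied
    by the caller and are hypothetical in this sketch.
    Returns (name_of_step_that_cleared_the_issue_or_None, log).
    """
    log = []
    for name, apply, undo in steps:
        apply()
        if not issue_present():
            log.append((name, "issue gone"))
            return name, log
        log.append((name, "no change"))
        undo()  # restore the original condition before the next test
    return None, log


# Hypothetical system state: the issue is caused by stale PRAM settings.
state = {"usb_connected": True, "pram_stale": True}

steps = [
    ("Disconnect external USB and FireWire devices",
     lambda: state.update(usb_connected=False),
     lambda: state.update(usb_connected=True)),
    ("Reset PRAM",
     lambda: state.update(pram_stale=False),
     lambda: None),  # a PRAM reset cannot be undone in this toy model
]

fixed_by, log = run_steps_individually(steps, lambda: state["pram_stale"])
print(fixed_by)   # prints "Reset PRAM"
for entry in log:
    print(entry)
```

Because each step is checked on its own, the log shows which change made the difference. Had all the steps been applied together, the issue would still have cleared, but you would not know why.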

Component Isolation

One of the major difficulties in troubleshooting computer systems is pinpointing the specific cause of
a hardware issue.

Previously, you learned about diagnostic tools and references that can assist you in determining
whether an issue is caused by hardware or software. In addition, earlier in this lesson we reviewed the
general principles of troubleshooting that Apple advocates.

With that knowledge under your belt, you should be at a stage where you can tell whether you're
seeing a software issue or a hardware issue. You should also be prepared to systematically address a
troubleshooting issue. We will now look at a method that can aid you in identifying faulty hardware
components. This procedure is called component isolation—a technique with which you can
accurately and decisively determine the source of hardware issues.

Here's how it works: Using a minimal system, you start up a computer and observe its behaviour.
Armed with an understanding of the normal power flow sequence (discussed later in this lesson), you
can use the symptoms you observe to decide which components to add or replace, in a specific
sequence, until you determine which hardware component is causing the issue.

You should not confuse this procedure with randomly swapping modules until a system finally works.
As you will see, component isolation works in a much more systematic manner.

You should use component isolation:

 When you are attempting to isolate intermittent, hard-to-find hardware issues
 When other approaches have not worked, and you need to make sure that the system
hardware is working correctly

NOTE

Component isolation requires an ESD-compliant work area, appropriate tools for taking apart the
product you are testing, and job aids identifying components of the system and the steps of the
procedure for that system.

How Does Component Isolation Work With Diagnostics?

The last lesson introduced you to a variety of Apple and third-party diagnostic tools. All of these
diagnostics can give you indications of defective hardware components, but no diagnostic software is
accurate every single time. That's why experienced technicians use multiple diagnostics to verify a
particular finding.

Component isolation offers a fail-safe method of confirming that hardware components are
functional. It should be considered as a companion technique to diagnostic software.

Understanding the Power Flow Process

The basis for the component-isolation troubleshooting technique is an understanding of power flow
within computers.

When a computer starts up, many different activities occur. All of these activities rely on the correct
flow of power within the system.

Let's look at a Power Mac G4 (Quicksilver) as it starts up. The following steps are a very simplified
description of a complex process. Nevertheless, these simplified steps will assist you in understanding
component isolation.

When you press the power button on a Power Mac G4 (Quicksilver):

1. Power flows through the power cord to the power supply. If the power cord or power button
is defective, the system will not start up.
2. The power supply feeds power to the main logic board. If the power supply or the connection
from the power supply to the logic board is defective, the system will not start up.
3. The logic board in this Mac model feeds power to a CPU card. If the logic board or the CPU
card is defective, the computer will not start up.
4. The logic board feeds power to the RAM as well. If the RAM is defective, the computer will
not start up. Instead, you will hear an error sound for defective RAM.
5. The logic board sends a start-up sound or signal to the speaker assembly in the front panel
board if the power-on self-test (POST) is successful. If this start-up sound occurs, you know that
the components in this power chain are working correctly. If you don't hear a start-up sound,
the speaker could be disconnected or defective, or the speaker volume may be turned down
or muted altogether.
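The five simplified steps above can be modelled as an ordered chain in which the first failing stage decides what you observe. The sketch below is only a teaching aid: the component names, the outcome strings, and the function `observed_startup_result` are labels chosen for this example, and a real Power Mac G4 start-up is far more complex.

```python
# Each stage lists the components involved and what you observe if one fails.
# Component names and outcome strings are labels chosen for this sketch.
POWER_CHAIN = [
    ({"power cord", "power button"}, "no start-up"),
    ({"power supply", "power supply connection"}, "no start-up"),
    ({"logic board", "CPU card"}, "no start-up"),
    ({"RAM"}, "error sound for defective RAM"),
    ({"speaker"}, "no start-up sound"),
]


def observed_startup_result(defective):
    """Walk the chain in order; the first failing stage decides the outcome."""
    for components, outcome in POWER_CHAIN:
        if components & set(defective):
            return outcome
    return "start-up sound"


print(observed_startup_result([]))                        # prints "start-up sound"
print(observed_startup_result(["RAM"]))                   # prints "error sound for defective RAM"
print(observed_startup_result(["RAM", "power supply"]))   # prints "no start-up"
```

The last call shows why order matters: a defective power supply hides a defective RAM module, because power never reaches the RAM stage.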

Starting With a Minimal System

In the description of power flow, we made no mention of hard disk drives. This was intentional,
because when setting up a minimal system for the component-isolation technique, you start with only
the components necessary to hear a start-up sound or see a flashing question mark on the monitor.

You do not need a hard disk drive when testing power flow in a minimal system. The POST does not
rely on any components of the Mac OS residing on the hard disk. Likewise, if you have a working power
button on the Mac itself, you do not even need a keyboard.

A minimal system is exactly that. For a Power Mac G4 (Quicksilver), for example, the minimal system
consists of the AC power supply (including, of course, a power cable), logic board, front panel board,
speaker assembly, and CPU with heat sink. All other devices should be disconnected, although it's not
necessary to physically remove them from the computer unless they're in the way.

NOTE

For some Macintosh models, RAM is not a required component because a minimal amount of RAM
may already be part of the main logic board on those models.

Minimal System Chart

The following diagram is a component-isolation job aid for the Power Mac G4 (Quicksilver and AGP
Graphics), showing which components on the logic board must be disconnected and which
components must remain connected.

Note that some of the required minimum components are not on the logic board. For example, the
CPU and speaker assembly are on the logic board, as you can see below, but the front panel board
(which controls the power button) is not.

NOTE

You must shut down the computer before you disconnect or connect any modules except external
USB and FireWire devices.

Component-Isolation Procedure

Students in the AppleCare Technician Training (ATT) program are asked to perform the following
procedure to reduce a Power Mac G4 (Quicksilver and AGP Graphics) to its minimal configuration for
testing. If you are studying this on your own, the preceding diagram gives you the information you
need to complete the procedure, but you should be cautious about going ahead without skilled
supervision.

NOTE

The component-isolation procedure is not the same on all Mac models; do not attempt to extrapolate
this lesson to other models. You will find sample component-isolation charts on the CD.

1. Reduce the machine to the minimal configuration.
2. Press the power button. You should hear a start-up sound. A start-up sound means the
minimal configuration is working. If you get no start-up sound, the logic board is probably
corrupted or another module in the minimal configuration is faulty.
3. If you do not get any sound from the minimal system, verify that the power supply is working
according to the instructions in the service manual. (For Power Mac G4 systems, you can find
this procedure in the Troubleshooting section under “Power Supply Verification.”) If the
power supply does not check out as specified in that procedure, replace the power supply
with a known-good component.
4. If you still do not get any sound from the minimal system, reset the PMU and restart. If the
PMU chip reset has no effect, perform a logic board reset by removing all power from the logic
board for at least 15 minutes.

NOTE

Resetting the PMU two times in a row can cause issues with the PMU, resulting in the rapid
and complete draining of the Macintosh computer's internal battery, which will then require
replacement. To avoid this situation, always reset the main logic board before resetting the
PMU a second time.

5. If you still do not hear a start-up sound, remove the RAM DIMM, reset the PMU and logic
board, and restart the system. If you get an error sound signifying a memory error, replace
the memory with known-good RAM and restart the system.
6. If none of these steps has corrected the start-up issue, install the video card and connect a
known-good monitor. Reset the PMU and main logic board, and restart the system. If you get
a flashing question mark but no sound, the issue is probably with the speaker assembly or the
front panel board.
7. If you do not hear an error sound or see a flashing question mark, then replace the CPU with
a known-good component, reset the PMU and main logic board, and restart the system.
8. If the system still does not start up, replace the main logic board with a known-good
component. Reset the PMU and logic board, and restart the system.

9. If you hear the start-up sound, install the video card (if it is not already installed) and attach a
known-good monitor to it. Restart the system and look for a flashing question mark. If you see
that image, the video card is working correctly.
10. By this stage you should have achieved minimal configuration. (You don't have to reset the
PMU and logic board once you have achieved minimal configuration unless another service
issue appears.) Reattach the internal hard disk drive to the connector on the main logic board.
Restart the system. You should see a normal start-up screen for the Mac OS installed on that
system.
11. Continue to add any additional components and peripherals one at a time—with power off,
of course. If a service issue appears while adding other components, you should go back to
the last stage before the service issue appeared, reset the PMU and logic board, and recheck.
By this point, you will have a good idea what the service issue is.
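The final stage of the procedure, adding components back one at a time, can be sketched as a simple loop. This is an illustration under stated assumptions: `system_ok` stands in for whatever check you perform after each restart, and the component lists and the function name `find_faulty_addition` are hypothetical.

```python
def find_faulty_addition(minimal, extras, system_ok):
    """Add components to a working minimal system one at a time.

    Returns the first added component after which the system stops
    working, or None if every addition passes. `system_ok(installed)`
    is a caller-supplied check and is an assumption of this sketch.
    """
    installed = list(minimal)
    if not system_ok(installed):
        raise ValueError("the minimal configuration itself does not work")
    for component in extras:
        installed.append(component)   # in practice, done with the power off
        if not system_ok(installed):
            return component
    return None


# Hypothetical example in which the optical drive is the faulty part.
minimal = ["power supply", "logic board", "front panel board",
           "speaker assembly", "CPU"]
extras = ["video card", "hard disk drive", "optical drive", "PCI card"]

suspect = find_faulty_addition(
    minimal, extras, lambda installed: "optical drive" not in installed)
print(suspect)  # prints "optical drive"
```

The component returned is a suspect rather than a verdict: the fault may lie in the part itself or in the cable that connects it, so both should be checked against known-good items.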

NOTE

You should also examine all cables and connectors carefully when a symptom appears. Always check
cabling with known-good cables before replacing any component that uses a cable to connect to the
logic board. A bad connection due to a defective cable looks just like a bad module.

Systematically test variables until the problem is isolated

Systematic approach to specific hardware problems⁹

What if there appears to be no power?

If your computer fails to power up (no fans can be heard or seen running and no lights come on when
you press the power button), check the following:

 Make sure that you're pressing the power button on the front of the computer and not the
reset button. If this is a new installation, try pressing the reset button to see whether it turns
the computer on; the switch leads are sometimes connected to the wrong terminals on the
motherboard.

 Make sure that the power cord is securely plugged into both the power outlet and the
computer's power supply.
 Make sure that the main power switch on the back of the computer's power supply is on (set
to the '1' position).
 Make sure that the outlet the computer is plugged into has power. Unplug the computer and
plug a lamp into the same socket. If the lamp lights, the outlet has power. If the lamp does
not light and it's plugged into an outlet strip, plug the lamp directly into the wall socket. If
the lamp lights in the wall socket but not in the outlet strip, be sure that the strip's power
switch is on. If the strip has a circuit breaker, be sure that the breaker has not tripped.

9
Source: Perry Babin, as at http://www.bcot1.com/troubleshooting01.html, as on 9th December, 2016.

 Make sure that the power switch on the front of the computer is working properly. The power
switch should short (connect) the two wires together when you depress the switch. To check
this, you'll need a multimeter (set to ohms). You'll place one probe on each wire. If your meter
probes can make direct contact with the metal terminal in the plastic connector, do so. If you
cannot, you will need to insert two fine, solid wires as is shown in the second photo. The wires
are pushed all of the way through to hold them in place and to help ensure that they're making
contact with the terminal in the plastic connector housing.

You will then touch the probes to the wires to check the switch. Ideally, the ohmmeter should
read 0 ohms when you depress the switch, but many cases use poor-quality switches that will
not go below ~20 ohms. As long as the reading goes below ~100 ohms, the switch is probably
OK. Do not let your fingers come in contact with the probes or the wires in the switch housing.

For those who don't entirely understand how to use a multimeter, you need to set it to 'ohms'
(sometimes indicated by the omega symbol, Ω, which looks a bit like a horseshoe). If it's not an
auto-ranging meter, set the meter to the highest ohms range. With the metal part of the probes
not in contact with anything but air, make a note of what the display reads (often OL or 1).
Now touch the probes together and make a note of that reading. Then bridge the probes with
your finger; the display will show a varying value. When the meter probes are touching the two
wires from the power switch (or the reset switch, which functions in the same way) and the
button is not pressed, the meter should read the same as 'air' (as described above). When the
button is pressed, it should read the same as when the probes are touching. It should NOT
read anything like the value you saw when you bridged the probes with your finger.

This is very important, especially if your computer is crashing for no apparent reason. I've had
several switches which were leaky (reading as when you bridge the probes with your finger).
This caused the computer to crash randomly.

You can use a piece of wire (inserted into the back of the plastic connector housing) to short
the terminals (to attempt to power up the computer) if you think the switch is at fault. Of
course, the connector has to be on the motherboard when you do this. If you are careful and
don't short to any other pins, you can short between the two pins for the power switch with
anything conductive. If you have a shunt (like the ones used as jumpers on the motherboard
or hard drive), you can use that to temporarily (momentary contact is all that's needed) short
the power switch pins on the motherboard. I generally use either my meter probe (example
below) or a jeweller’s screwdriver.

Note:
Many times, the motherboard isn't marked clearly (or at all). The owner's manuals for virtually
all motherboards are available for free in PDF format online. Try Googling the model number
of the board (generally printed prominently on the board) and 'motherboard manual'. If you
haven't yet seen a manual for a motherboard, the one for the board above is at
http://www.bcot1.com/N61PB-M2S_090918.pdf. I'd suggest that you only download from
the manufacturer's web site if possible. Many of the other 'manuals' sites are infected with
malware or force you to jump through hoops to get to the download page. If the download is
an exe file, you probably shouldn't download it. It could be infected.

For times when you can get the computer to power up but need other information about it,
try downloading PC Wizard from CPUID.com. PC Wizard can tell you virtually anything you
want to know about the computer.

If the lights light up and the various fans spin up when you press the power button but you get no
display, check the following:

 Be sure that the monitor is on.

 Confirm that you have power to the monitor.

 If you have an LCD monitor, there may be a small power supply (typically a small rectangular
black box) between the wall plug and the plug that connects to the monitor's power supply
input. Be sure that the plugs on the power supply and monitor are securely seated.

 Make sure that the monitor is plugged in securely to the video connector on the back of the
computer. Some monitors have a video cable with removable connectors on both ends so be
sure to check the connection on the back of the monitor also.
 If the monitor still has no display, see if the on-screen display for the monitor's controls can
be displayed. You may need to find the owner's manual for your monitor to learn how to
access the on-screen display. If the OSD works, the monitor has power but it is possible that
the monitor is faulty. If possible, try another monitor on your computer. If another monitor
works properly, your monitor may be defective. If another monitor (that's known to be in
good working order) doesn't work, you may have problems with your video card or with the
video drivers. Try booting to safe mode. If you have video in safe mode, you likely have video
driver problems or the video settings are not set to work with your monitor (if the settings are
not correct, the monitor will typically alert you to that problem).
 If the operating system is causing the problem, shut the computer down (hold the power
switch on the front of the computer down until the fans stop or disconnect power from the
computer). Make sure the monitor is on (power LED will be on but likely amber instead of
green). Depress the power switch to start the computer (reconnect power if you had to
disconnect it to shut the computer down).
 After the power switch is pressed, you should hear several fans start and you should see
simple (black and white) text on the screen as the computer reboots. If you do, you know the
monitor is working. If you don't, the monitor or the monitor cable may be defective. The
following is from a computer that didn't have a hard drive in it so it couldn't boot up. A working
computer won't show this type of error and will go through this screen so quickly that you
won't notice it.

If you've recently installed a new graphics card, make sure that the monitor cable is plugged into the
new card. Many computers have two or more graphics ports. If it has one on the motherboard's IO
panel and a second one in one of the expansion slots, the one in the expansion slot is almost always
the one that's going to be used by the computer. If the secondary graphics card is anything other than
a basic card, it may need an external power source (from your computer's power supply). Check the
installation manual for the graphics card.

Loose Connections:

Many times, a computer will not power up simply because one of the connectors on one of the
components inside the computer isn't properly seated. If the computer was moved (or dropped) just
prior to it becoming inoperative, there may simply be a loose connection. To check this, unplug the
AC mains power plug from the computer's power supply (this is very important). Pull the left side cover
off of the computer. On all connectors, gently press them to be sure that they're properly seated.
There will likely be several power supply and IDE connectors that are not to be connected so don't be
alarmed if you see connectors that are unplugged. After you've checked all of the connectors, replace
the side cover, plug the power cord into the computer's power supply and try to power up the
computer.

Sometimes, memory problems cause system instability (crashes). If you want to test the memory,
download a program like memtest and burn it to a CD. It's a bootable CD so if it's in the CD/DVD drive
when you boot the system and the BIOS settings tell the computer to try to boot to CD first, the
memtest software will run automatically. If it makes it through an entire pass, the memory is likely
OK.

Power Supply in Protect Mode:

If, when you press the power button, the computer's fans start to turn, then shut off, you could have
power supply problems or there may be something loading down the power supply (causing it to go
into protective shutdown).

When you press the power switch, the green wire in an ATX power supply is connected to ground (by
the motherboard, not directly grounded by the switch). At that point, the power supply is switched on
and all of the various voltages are generated. When the various voltages are within a specified
tolerance, a signal is generated on another of the power supply pins (the power OK pin) and the
processor is then switched on. All of this takes well under a second and most people never realize that
it's happening.

ATX power supplies can be tested by disconnecting them from the motherboard and connecting the
green wire to any of the black wires. This will turn the supply on. At that point, you can test the
individual voltages. If all of them are close to the rated voltage, the supply is likely OK. If the power
supply tried to start but wouldn't start (or wouldn't run for more than a second) when it was plugged
into the motherboard, you should suspect that something is pulling too much current. Plug the power
supply back into the motherboard and disconnect all of the drives' power connectors as well as any
accessories such as fans. Again, try to power up the computer.

If the PS starts normally, begin reconnecting the power supply to the accessories and drives (one
piece at a time and trying to start the machine after each piece is plugged in). If you find that one piece
is preventing the machine from starting, you've likely found your problem.
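The reconnect-one-piece-at-a-time procedure is a simple linear search. The sketch below is only an illustration of the logic: `starts_with` is a hypothetical callback that stands in for you pressing the power button and reporting whether the machine started with the listed pieces connected.

```python
from typing import Callable, Iterable, List, Optional


def find_blocking_component(
    components: Iterable[str],
    starts_with: Callable[[List[str]], bool],
) -> Optional[str]:
    """Reconnect components one at a time and return the first one that
    stops the machine from starting, or None if it starts with everything."""
    connected: List[str] = []
    for part in components:
        connected.append(part)          # plug in one more piece
        if not starts_with(connected):  # try to start the machine
            return part                 # this piece is the likely culprit
    return None
```

Because only one piece is added between attempts, a failed start points at the piece that was just connected.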

Note:
I've read that some power supplies require a load on one or more of their outputs to power up. I
haven't encountered this yet but you should be aware of it.

You may not realize that the actual output voltages are rarely precisely at the rated voltage. The
following voltages are from a relatively inexpensive supply. The variation from the rated output that
you see here is relatively common and this supply will work fine in most computers. Of course, when
the power supply is loaded by the computer, the voltage WILL change somewhat. This is an UN-loaded
power supply. As a side note, sometimes the output voltage of some of the outputs will actually
increase when the supply is loaded. This often happens when the regulated output (generally the +5
volt output) is loaded down. The reason it happens is that the pulse width of the power supply is
increased to maintain the regulated 5 volt output. When the pulse width is increased, all of the other
outputs will increase (the 5v output should remain constant). Recently, some power supplies have
begun employing two independent sets of regulators. This will help keep more of the output voltages
within a tighter tolerance.
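Measured readings can be compared with the nominal values in a few lines of Python. This is only a sketch: the colour-to-voltage mapping follows the usual ATX wiring convention, and the tolerance figures (5% on the positive rails, 10% on -12 V) are the commonly quoted ATX limits, treated here as assumptions. Confirm them against the documentation for your supply. The gray wire carries the power OK signal rather than a supply rail, so it is left out.

```python
# Nominal rail voltages keyed by ATX wire colour.
RAILS = {
    "orange": 3.3,
    "yellow": 12.0,
    "red": 5.0,
    "blue": -12.0,
    "violet": 5.0,  # +5 V standby
}

# Assumed tolerances: 10% on the -12 V rail, 5% on everything else.
TOLERANCE = {"blue": 0.10}
DEFAULT_TOLERANCE = 0.05


def rail_in_tolerance(colour: str, measured: float) -> bool:
    """Return True if the measured voltage is within the assumed tolerance."""
    nominal = RAILS[colour]
    tol = TOLERANCE.get(colour, DEFAULT_TOLERANCE)
    return abs(measured - nominal) <= abs(nominal) * tol


def check_supply(readings: dict) -> dict:
    """Map each measured wire colour to True (in tolerance) or False."""
    return {colour: rail_in_tolerance(colour, volts)
            for colour, volts in readings.items()}
```

For example, `check_supply({"yellow": 12.3, "red": 4.6})` flags the yellow wire as acceptable and the red wire as out of range under these assumed limits.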

[Photos: meter readings on the orange, yellow, red, blue, gray and violet wires of the unloaded power supply]

If the machine will not start with no accessories plugged in but the power supply powers up fine with
the green wire shorted to the black wire (don't try shorting green to black while it's connected to the
motherboard), the processor may be dead. This is a somewhat difficult situation. When
troubleshooting computers, it's common to simply replace the questionable component with a known
good component.

Since you won't know whether the motherboard or the CPU is defective, you may want to plug the
questionable CPU into a known good test board. If the test board has insufficient protection and the
CPU is shorted, the PWM regulator in the test board could be damaged (leaving the board irreparably
damaged). If you plug a good CPU into a questionable board that has a defective PWM regulator, then
you will kill the test CPU. Many times, if you have a blown CPU, it's best to replace both the CPU and
the motherboard. If you decide to swap the CPU to check it, make absolutely sure that you have the
correct heatsink properly installed BEFORE applying power to the unit. Some CPUs have no thermal
rollback/shutdown protection and if the heatsink isn't properly installed, the CPU will fail within
seconds.

Improperly Seated Memory Modules

One very common problem (after a computer has been moved) is improperly seated memory
modules. If the computer won't boot up or gives a memory error on the BIOS screen, shut the
computer down, remove and re-seat the memory modules. If it still gives an error, you may have to
remove all of the modules and install only one at a time (in the various memory slots). If you find that
one memory module or one memory slot is causing the error, you will need to avoid using it.
Remember to support the board from the back as much as possible when seating the modules.
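Installing one module at a time in each slot produces a small grid of pass/fail results, and the faulty part falls out of that grid. The sketch below shows the reasoning only: `boots_ok` is a hypothetical callback that stands in for you installing a single module in a single slot and noting whether the machine boots without a memory error.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def isolate_memory_fault(
    modules: Sequence[str],
    slots: Sequence[int],
    boots_ok: Callable[[str, int], bool],
) -> Tuple[List[str], List[int]]:
    """Test every module alone in every slot.

    A module that fails in every slot is reported as suspect, and so is a
    slot in which every module fails.
    """
    results = {(m, s): boots_ok(m, s) for m, s in product(modules, slots)}
    bad_modules = [m for m in modules
                   if not any(results[(m, s)] for s in slots)]
    bad_slots = [s for s in slots
                 if not any(results[(m, s)] for m in modules)]
    return bad_modules, bad_slots
```

If nothing boots in any combination, every module and every slot is flagged, which usually means the fault lies somewhere else.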

What if it won't boot after a NEW install?

When you assemble a new machine or install a major component (motherboard, CPU...), sometimes
it won't boot up the first time you turn it on. The problem could be a defective component, a connector
that's not properly seated or a BIOS/motherboard setting that isn't right. These are just a few of the
possibilities.

 If this is the first time that you've attempted to power up a newly assembled computer (and
therefore have made no changes in the BIOS), you can try clearing the BIOS. This will return it
to the original state. I've had a couple of boards that refused to do anything when they were
initially booted up and clearing the BIOS allowed them to power up. Do not do this if you're
working on a computer that only recently became inoperative and uses a RAID array for the
hard drive. If you do and you're not familiar with the original setup, you may cause all data to
be lost on the hard drives. Of course, this doesn't apply to a new system that has never booted.
 Be sure the memory is properly installed/seated. If you can find no other problems, you could
have a defective or incompatible stick of memory. There have been times when a stick of
memory wouldn't work in one machine but a different stick (from a different manufacturer)
with the same specs would function properly. The stick that wouldn't work in the first machine
was fine in other machines. The stick that wouldn't function properly fully passed all memory
tests without errors.
 Be sure you have a working graphics card installed. Generally, the POST (Power On Self-Test)
beeps will tell you that you have a defective or improperly installed video card. The beeps
vary with different BIOS manufacturers and you should do a Google search for 'POST beeps'
to determine what they mean for your BIOS. Dell computers have 4 indicator LEDs
to help diagnose problems. You can find the codes by going to
http://support.dell.com/support/edocs/systems/dim2350/advanced.htm or searching for
'Dell diagnostic code' on Google.
 When you can't determine what the problem is, reduce the system to its most basic form. The
system only needs memory, the processor and the power supply to power up and POST. Of
course, if the board doesn't have on-board graphics, you'll also need to have a video card. For
this level of testing, you should not even have the mouse and keyboard plugged in. It may also
be good to disconnect the power and reset switches from the motherboard. If either is stuck,
it could cause problems.

What if it won't boot for no apparent reason (simply quit booting)?

Sometimes, a computer will simply fail to boot. If you haven't done anything that could be causing
problems (installing/updating software/drivers), some of the system files may have become corrupted
or some piece of hardware may have failed. If the computer starts to boot (it shows signs of life), it
means that the power supply is probably OK. If it repeatedly fails to boot into Windows, you need to
try booting to 'safe mode' (covered earlier in the tutorial). If it boots into safe mode, it indicates that
the computer's hardware is likely OK and the problem is probably software/driver related. Unless you
want to go through extensive troubleshooting, the best thing to do is to go back to the last restore
point. Restore Points were covered earlier in the tutorial.
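The reasoning in this paragraph is a two-question decision. A minimal sketch, with the wording of each outcome paraphrased from the text (the first branch, for a machine that shows no signs of life, is my own assumption about where to look next):

```python
def diagnose_boot_failure(shows_signs_of_life: bool,
                          boots_safe_mode: bool) -> str:
    """Suggest where to look when a computer has simply quit booting."""
    if not shows_signs_of_life:
        # Assumption: with no signs of life, start with power.
        return "check the power supply and connections"
    if boots_safe_mode:
        return ("hardware likely OK; suspect software/drivers "
                "and consider the last restore point")
    return "suspect corrupted system files or failed hardware"
```

Writing the questions down in order like this is also a handy starting point when you draw a troubleshooting flowchart.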

What if it quit booting properly after installing a new piece of hardware?

If you installed a new piece of hardware to upgrade an older (but still functional) piece of hardware,
the first thing to do is to remove the new piece and reinstall the old component. If the system again
begins to boot and works as it did before, the new piece of hardware could be defective or causing
some conflict in the system. Its drivers could be corrupt (if you downloaded them) or they may have
some sort of incompatibility with your other hardware/software. If you're not going to try to reinstall
the new piece of hardware, you should uninstall the drivers for the component that you tried to install.

If the machine will still not boot after reinstalling the old component, you may have pulled a connector
out of its socket. Try pushing on all of the connectors on all components (don't forget the memory
modules) to be sure that they're all properly seated. If that doesn't work, try booting into safe mode.
If it boots there, your hardware is very likely OK and you simply have something loading during
boot-up that's causing the system to crash. At this point, you have several options. You can go back to
the last restore point (it should have been created when you installed the drivers for the component
that caused the problem). You can boot in safe mode and uninstall the drivers that you recently
installed. You can also go through the start-up list (run >> msconfig >> startup) and start removing
components until the system boots. If you're removing items from the start-up list, begin with the
ones that seem to be related to the hardware that you recently installed.

What if it quit booting properly after installing a new piece of software?

If the computer attempts to boot but crashes repeatedly, boot to safe mode and go to START >>
CONTROL PANEL >> ADD AND REMOVE PROGRAMS and uninstall the software. We covered the
procedure to uninstall software earlier in the tutorial. If the computer still refuses to boot, you may
have to go back to the last restore point.

Miscellaneous software problems

One of the most important tools for troubleshooting strange software problems is Google. There is no
way that any one web site can cover every problem. If you have a problem, take note of the EXACT
wording of the problem (take and save a screen-cap if you have a poor memory) and search Google.
If you enclose the error statement in quotation marks, it may make the search more successful.
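If you find yourself searching for error messages often, a few lines of Python can build the search link for you. This is only a sketch: it wraps the message in double quotes (the usual exact-phrase syntax) and URL-encodes it. The function name is made up for this example.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(error_text: str) -> str:
    """Build a Google search URL for the exact wording of an error message."""
    # Collapse runs of whitespace so a copied multi-line message still matches.
    phrase = '"' + " ".join(error_text.split()) + '"'
    return "https://www.google.com/search?q=" + quote_plus(phrase)
```

Paste the exact wording from the error dialog into the function and open the returned link in a browser.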

You may have to follow several links because many of the solutions offered on the forums or in the
newsgroups will not be the solution for your problem. If you're having trouble with a specific piece of
software, try the software's home page. Look in the 'support' and 'FAQ' sections of the site.

Using an Alternate OS

Sometimes, a computer will not boot due to a corrupt file in the OS or a defective hard drive. The
problem you face is determining which is the culprit. One tool that you can use is a 'live CD'. A Live CD
is a bootable operating system (generally a Linux distribution or a stripped down version of Windows).
My favorite is the Hiren Boot CD. It's not always easy to find but if you can find a copy that has mini
Windows XP on it, that's the one that I'd suggest that you use, especially if you're working on a
computer that was using a Windows operating system. With the live CD, you can go in and confirm
that the drive is accessible and the files are generally intact. If, for some reason, the operating system
is damaged, you can use the live CD to move the files to a different drive (either a USB flash drive or a
second hard drive). If you're going to have to reload the operating system from the Windows
installation disc, it's best to make a copy of all important files that are on the same partition as the
operating system (generally the partition normally labeled C drive). Sometimes these files will be lost
when reloading Windows. If you're going to restore the partition from a recovery file like those
produced by True Image or Ghost, EVERY file in the restored partition will be wiped out.

If you can't find a good live CD, you can remove the drive and connect it to another computer. I
generally keep a working but generally unused computer around for this. Although it's unlikely, it's
possible for malware to infect the computer when the drive is connected to the system. When you do
this, you should boot the computer and then connect the drive. If it's a SATA drive it may automatically
be recognized. If it's an IDE drive, you will have to go to START >> RIGHT-CLICK MY COMPUTER >>
select PROPERTIES >> select the HARDWARE tab >> select DEVICE MANAGER >> RIGHT-CLICK DISK
DRIVES >> select SCAN FOR HARDWARE CHANGES. You should see the new drive appear in the list of
disk drives. When you go back to Windows Explorer (My Computer), the drive should show up in the
list of drives. From there, move the files you need to save.

If the owner of the computer has a lot of important files and can't remember where they all are, the
best option may be to simply buy a new hard drive (~$39+ shipping) and load the OS onto the new
drive. Then, the owner can move the files to the new drive as they find them.

If you do this, make two partitions on the new drive. Make one for the OS. 40-50GB should be plenty.
That will leave at least 25GB for the owner to store her files on. It's best (in my opinion) that you not
store important files on the OS partition. If you've read the Backing Up Your Hard Drive page, you know
that it's also important to have additional backups of important files.

Cleaning a Dusty Computer

When working on computers, you'll find that many of them have pulled in a lot of dust/dirt/bugs... If
you know that the computer is going to be repairable, it's best to clean it before you begin working on
it. If you're uncertain as to whether it will be repairable, you may want to wait and clean it after it's
up and running.

When cleaning a dusty computer, the best tool is compressed air. The compressed gas that you can
buy works but not as well as air from an air compressor. If you're going to use air from an air
compressor, you must make sure that the air coming from the compressor is clean and dry.
Compressors set up for mechanics often have oilers on them to keep the air tools lubricated. If you
live in an area that has high humidity, you'll likely need a water trap on the compressor. Before using
the compressor to blow out the computer, you'll want to blow air onto a clean dry surface for about
30 seconds. If the surface is dotted with water or oil, it's not suitable to use on any electronic
components.

Before you begin to blow out the computer, you'll need to remove the optical drives. If you leave them
in, dust can be forced into them and can make them malfunction.

When you blow the computer out, you'll definitely want to do this outside with the wind/breeze
blowing so that the dust is blown clear of you and anything in the area. Some of these computers can
be quite dirty. When you do this, you should disassemble the computer as far as possible (remove
both side covers, front bezel...). When blowing the computer out, blow from many angles. For fans,
blow from both sides (being careful not to spin the fans too fast). When blowing the power supply
out, you'll need to blow repeatedly from both the intake and output sides from every conceivable
angle, until no more dust can be blown from it. Expect to do this for 5 minutes or more for really dirty
computers. When it's clean, there should be no dust coming from the computer, no matter what angle
you blow into the case.

If the computer was in the home of a smoker, the dust will probably be very sticky and be difficult to
remove. For critical surfaces like the heatsinks, you may need to remove them from the computer and
wash them with a stiff brush, soap and hot water (after the fans have been removed).

In the case of live infestations of bugs (namely, cockroaches), it's often better to let someone else deal
with it. If the cockroaches are alive, they can infest your home or shop and be difficult to get rid of. In
cases where there is a heavy infestation in the computer, the smell can be awful and very unpleasant
to work around. If you get a computer that's infested with live roaches you should immediately spray
the computer with insecticide and put it in a garbage bag that you can seal up (so they can't escape).
Many insecticide sprays leave an oily residue and are not well suited for this. If you can find Bengal
roach spray, use that. It's VERY effective and is a dry powder. Be aware, however, when you use
Bengal, roaches will try to escape it and will exit from every hole in the computer.

If you think that you'll have to work on the infested computer again, seal every entry-way that you
can find. Tape those that don't need to be open and install fiberglass window screen over the rest of
the openings (like those for fans). Tell the owner about the infestation and recommend that they hire
an exterminator or apply insecticide themselves (recommend the Bengal, it works). This is important
because cockroaches can cause a lot of damage to the other electronic equipment in their homes.
Even if it's repairable, it may not be possible to find a service technician that will repair it for them.

Suggestions for Working on Laptop Computers

Working on laptop computers isn't something that I'd recommend that you do unless you're very
confident in your abilities or you don't absolutely need the computer to work when you finish. It's very
easy to make a mistake that can cause significant damage to a laptop computer.

If you damage the motherboard, there is little chance that the computer will be repairable. There are,
however, a few things that most people can do if they're careful (tighten a loose monitor bracket,
replace the power jack, change the processor, clean the heatsink and fan for the processor or graphics
adapter...). The most common problems with laptop computers (that require partial or complete
disassembly) are broken power connectors, defective keyboards and failed drives.

Laptop computers are held together with a combination of screws and locking tabs. The tabs are
generally located around the perimeter. To release them, you have to push one half of the case in or
out along the point where they meet. Using your fingernails to do this may work for some but the
proper tools work better. You can use the same types of tools used to take apart the dashboards of
cars (available on eBay or from Harbor Freight). The 'black stick' spudger also works well.

The black spudger below is a Menda #35622. You can find them for less than $2 each. Buy several. You
may break one or two until you learn how much stress they can take.

One of the most difficult things to remember when repairing laptops is where each of the screws goes.
Here, a camera is your best friend. If you have a video camera, set it on a tripod and allow it to record
the disassembly of the computer. Record at the highest resolution possible. If you don't have a video
camera, photos are just about as good. Take a photo of each side of the computer and take new photos
each time that you remove something that reveals another area that was not previously visible. As
you disassemble the computer, you will see that there are many different screw sizes (length and
diameter). Use the photos that you take to make a diagram of the screw locations. Start with #1 for
the first screws that you remove and mark #1 on the photo. Use a divided container to keep the screws
organized. The best that I've found are the 14 day containers. Mark the container from 1-14 (15-28 on
the second container, 29-42 on the third...) and place the corresponding screws in the various slots.
You will need several of these for most laptops. When placing the screws in the individual slots, only
include those that were taken off at the same point in the disassembly process and that match exactly.
When you reassemble the computer, simply reinstall the screws in the reverse order. Many people
will think that this is a waste of time but it's very annoying when you think you have the computer
properly reassembled only to find that you have extra screws. Sometimes a screw is critical to the
proper operation of the computer because it grounds part of the board or ensures that the heatsink
sits properly on the CPU.
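The numbering scheme (1-14 in the first container, 15-28 in the second, 29-42 in the third) is simple arithmetic. The sketch below, with function names made up for this example, works out which container and slot a given disassembly step belongs in and lists the steps in reassembly order.

```python
SLOTS_PER_CONTAINER = 14  # one 14 day container


def screw_location(step: int) -> tuple:
    """Return (container number, slot position) for a disassembly step.

    Both values count from 1, so step 15 is the first slot of container 2.
    """
    if step < 1:
        raise ValueError("steps are numbered from 1")
    container, slot = divmod(step - 1, SLOTS_PER_CONTAINER)
    return container + 1, slot + 1


def reassembly_order(last_step: int) -> list:
    """Steps in the order the screws go back in: the reverse of removal."""
    return list(range(last_step, 0, -1))
```

For a laptop that took 30 steps to strip down, `screw_location(30)` tells you that the last screws removed are in the second slot of the third container, and they are the first to go back in.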

When you use a photo to show where the various screws go, lighten the photo. It makes it easier to
see the marks on it. You can either print it and make the marks with a sharpie or save the photo with
a new name and mark the photo with MS Paint or your favorite software.

I recommend the type of container above because each of the slots can be latched closed. You may
be tempted to use ice trays but I can tell you from experience that it's not a good idea. A slight bump
to the tray and screws will be thrown farther than you'd imagine. This will generally result in the loss
of many of the screws.

During the disassembly, there will be numerous connectors to disconnect. Most are straight-forward.
The first is the simplest. You can see that the plug is white and the socket is beige. Pull up on the white
part. It's better to only apply force to the plastic but you can generally help by pulling 'gently' on the
wires as well. I strongly recommend only using your fingernails to help pull the connectors. If you use
pliers and they slip off of the top of the connector, they can VERY easily shear off the wires. When
disassembling a laptop that you've never worked on before, it's important that you take your time and
don't force anything. Most pieces come off easily once all of the screws have been removed. If you
use too much force and are not careful, when the piece finally comes free, it could result in the tearing
of the various cables (ribbon cables are the most vulnerable when force is applied to them along a
sharp corner).

The following connector is for the video monitor. It's like the connector above but a bit larger. These
can take a bit more force to disconnect. If you must pull on the wires, pull straight up. If you pull at an
angle, you may apply too much force to the wires on the end of the connector which can cause it to
break. The break may not be visible (if part of the insulation was crimped into the terminal) and can
cause a lot of un-necessary problems.

This next connector is common on keyboards. It's a 2-part connector. The black part is stationary. The
white part slides up and down to release and lock the ribbon cable in the connector. To release the
cable, lift the lock about 1/16" and pull the ribbon free. When the lock is open, the ribbon should pull
free very easily. To reconnect the ribbon, make sure that the lock is in the up-position and insert the
ribbon. Then push the lock down until it clicks.

The next image shows the heatsink and fan for the CPU. I'd advise against touching this unless you
absolutely need to do so. The large gray part with the four brass posts is the heatsink. The two pipes
that go out and to the left are heat pipes and move the heat to the fins of the heatsink. Sometimes, a
laptop will begin to overheat, even when none of the vents (bottom and rear of the computer) are
blocked. This is sometimes due to excessive dust in the fins. The best way to clear this is with canned
air (compressed gas dusters). This generally works well enough but sometimes, you have to remove
the sink. When you remove the sink, you MUST replace the heatsink compound that goes between
the sink and the processor. If you try to re-use the original compound, the processor may overheat
and fail. The additional photos show the fan shroud and fan removed, as well as the heatsink removed
from the processor.

In this last photo, you can clearly see the thermal interface material (turquoise/green). That would
have to be completely removed and replaced. Virtually any thermal heatsink compound will work. Some
like the Arctic Silver 5 but it's not necessary to use it. If you were going to replace the processor, you
would release the processor by turning the screw on the socket counter-clockwise. After replacing the
processor, you would turn it clockwise to lock it in.

Activity 6

This activity requires that you create a decision tree that will assist a technician in troubleshooting
a standalone computer system that will not power on. You can simply use a word processing
application such as MS Word to create a simple flowchart, or you may use a specialist program such
as MS Visio to generate a chart.

Refer to the following graphic for guidelines on how to use simple flowcharting symbols.


Using your chosen tool, draw a decision tree for troubleshooting this problem.

Typical system problems and causes

The first part of this reading will present a range of typical faults that are likely to occur in computer
systems.

Boot-up time faults

Boot-up time faults are those faults that occur during the boot-up sequence. The boot-up sequence is
the first major process that occurs when a computer system is turned on. This is a critical stage as the
boot-up sequence is especially susceptible to faults, which might render the system unusable.
Generally boot-up faults can be caused by:
 POST failure: POST, or Power-On Self-Test, is an initial test that a computer system executes
automatically when turned on. The system uses POST to test its integrity, by ensuring that all
basic functions and components are free of faults. POST generally tests things such as CPU,
Mainboard, RAM, Hard Drives, Input Devices etc.
 Boot Device Failure: If the device responsible for containing the Master Boot Record (MBR)
fails, the boot-up sequence fails. Generally, a hard disk drive contains the MBR.
 Operating System Failure: If there is a major fault with the Operating System, the boot-up
sequence stops. Generally, OS faults are caused by misconfiguration, system files corruption,
hardware or software faults which did not appear during POST, compatibility problems etc.
 Minor Faults: These faults are not severe enough to halt the boot-up process but they can
have an impact on the functionality of the system. For instance, a peripheral device that does
not initialise correctly at boot time because the proper device drivers are not installed.

Poor performance

Usually, poor performance does not have a critical impact on a system. In some cases though, the lack
of system resources can have severe enough consequences to stop certain functions. For example, a
severely congested network might not allow a user to log on to the network (some would argue this
is critical enough!), due to timeout errors.
The underlying cause of faults related to poor performance can be found in one or more of the
following:
 Not enough Random Access Memory This might result in certain applications not being able
to launch and function normally.
 Not enough Virtual Memory A system which does not have enough free disk space available
for a paging file (Virtual Memory), might not execute correctly, possibly halting or operating
very slowly.
 Slow Central Processing Unit (CPU) This problem won’t allow a system to deal with processing
requests in a timely fashion. Applications that require timely processing might not function
correctly, or time out. System will operate very slowly.
 Slow Network Network problems are difficult to solve since the possible sources of
bottlenecks can be many. Slow servers, slow hubs/switches/routers, slow WAN links, slow
network cards, or simply congestion will cause slow networks. Slow networks can produce
timeout errors on applications, or delays. Slow networks generally produce a large number of
errors, requiring retransmissions, which in turn congest the network even further.
 Input/output (I/O) bottleneck Generally relates to slow hard disk drives. Historically, hard
disk drives have lagged behind in terms of bandwidth when compared to CPUs and RAM. A
system with little RAM will depend heavily on Virtual Memory (fake RAM on the hard disk
drive), increasing the demand for I/O from the hard drive, worsening an existing bottleneck.
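
As a rough illustration of how one of the causes above can be checked, the following Python sketch reports whether a volume is running short of free space, which matters for the paging file. It uses only the standard library. The function names and the 10% threshold are assumptions made for this example, not vendor recommendations.

```python
import shutil

def is_low(free_bytes, total_bytes, min_free_fraction=0.10):
    """Return True when the free share of a volume falls below the threshold.

    The 10% default is an illustrative assumption, not a vendor figure.
    """
    return free_bytes / total_bytes < min_free_fraction

def volume_is_low(path="."):
    """Check the volume that holds `path` using the standard library."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return is_low(usage.free, usage.total)

print("Low disk space" if volume_is_low(".") else "Disk space looks adequate")
```

A check like this only points to a possible cause. Memory, CPU and network load would still need to be measured with the appropriate monitoring tools.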

Network faults

Network Faults are complex, and their source can be varied. Typical network faults can be related to:
 Performance Network congestion can be a significant problem. Generally, networks that are
very heavily used might experience performance issues and congestion. Poor design could
also be the cause of this.
 Errors Network errors can be caused by faulty equipment (e.g. faulty cabling, a faulty switch or even
a faulty Network Interface Card). Congestion can also be a source of errors, as retransmission requests
increase. Finally, misconfiguration can lead to significant error rates.
 Security Network Security faults can be complex and varied. The source can be found in
misconfiguration, hardware and software design flaws, documented and undocumented
bugs, vulnerabilities etc.

Software and hardware design flaws

Computer systems nowadays are incredibly complex. Take the Windows operating system, or even
Linux: millions of lines of code have been compiled into each of those systems. The chance of
something going wrong can be expected to increase with the complexity of the system.
It is an accepted fact that new releases of operating systems and applications will experience some
‘teething’ problems. Often, products will not become reliable until at least several months after
release and after one or two service packs have been released.
Software applications can also be ‘buggy’ and usually benefit from the regular release of patches and
‘hot-fixes’. Needless to say, as with the OS, administrators are responsible for deploying these fixes.
Hardware design flaws are not uncommon either. Many hardware manufacturers (particularly
network equipment manufacturers) release updates for their products in the form of ROM patches
(usually called firmware), which can be ‘flashed’ onto a device.

Compatibility faults
Compatibility refers to the ability of components (software or hardware) to function and interact
properly without faults. Compatible components are designed to a certain standard or guideline so that
functionality (hence compatibility) is assured.
It seems unusual to think that someone would deploy or install incompatible components. Not so –
sometimes incompatibility is not known straight away, not even to the developers themselves, until
thorough field testing has been conducted.
Hardware compatibility issues do not arise as often, as compatibility is easier to establish beforehand.
In addition, hardware devices will usually either work or not from the outset. With software, it is not
as clear-cut. Software versioning is a big problem, as some versions introduce variations of system files
and libraries whose compatibility cannot be established until they have been deployed and fully tested.
Almost everyone has at some point encountered a compatibility problem that did not manifest itself
during installation, but only later when a specific circumstance arose.
Generally, compatibility can be ascertained by following certain guidelines:
 The manufacturer of the component vouches for or discloses whether the component is compatible
(e.g. a ‘Built for Windows XP’ logo)
 Host system meets software and hardware requirements (e.g. Windows 98 SE or higher,
300 MHz processor, 64 MB RAM, 200 MB disk space, DirectX 9)
 Components are known to support common technology. For instance, components support
Fast Ethernet and TCP/IP
 Careful scrutiny of product specification indicates compatibility.
 Contacting vendor to obtain further information.
 If unsure about compatibility, physical installation and configuration in a testing
environment can sometimes ascertain compatibility. This is not always advisable, as physical damage to
hardware might result if devices aren’t compatible.
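
The second guideline (checking the host against published minimum requirements) can be sketched in a few lines of Python. The requirement figures are taken from the example in the list above. The dictionary keys, the function name and the sample host are hypothetical and exist only for illustration.

```python
# Minimum requirements taken from the example in the guidelines above.
REQUIRED = {"cpu_mhz": 300, "ram_mb": 64, "disk_mb": 200}

def unmet_requirements(host, required=REQUIRED):
    """Return the names of the requirements that the host fails to meet.

    A value missing from `host` is treated as 0, so it counts as unmet.
    """
    return [name for name, minimum in required.items()
            if host.get(name, 0) < minimum]

# Hypothetical machine: fast enough CPU and disk, but too little RAM.
host = {"cpu_mhz": 450, "ram_mb": 32, "disk_mb": 1000}
print(unmet_requirements(host))  # ['ram_mb']
```

An empty list means the numeric minimums are met. It does not prove compatibility, which still depends on the other guidelines listed.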
Compatibility issues can produce sporadic faults, particularly when compatibility cannot clearly be
established or construed. As usual, faults need to be assessed as per normal IT management policy to
determine criticality and appropriate action.

System misconfiguration and corruption


System misconfiguration can be a significant problem. Generally, misconfiguration can lead to a
variety of faults:
 System Services not available
 Network function impaired or not available
 Applications might not work
 Devices might not function
 Poor Performance
 In the worst case, system is unusable.
Misconfiguration can be caused by lack of knowledge/experience from technical personnel, human
error, as a consequence of a failed process (for example, a system file becomes corrupt due to a disk
failure), malicious software or hacking/cracking. Corruption refers to system files/configuration
becoming unusable/unavailable due to a fault. Commonly corruption can occur due to:
 Hardware failure Failed hard disk, faulty memory, faulty component.
 Buggy software Sometimes software, which has access to system files, may modify system
files without reason due to poor software engineering.

 Security compromise Malicious software; security breaches can deliberately modify system
configuration to render a system unusable.

Problem solving skills


There are several problem solving skills that a technician should endeavour to develop. This part of
this reading will enable the reader to develop an understanding for problem solving skills such as Fault
Tree Analysis (FTA), Hierarchical Task Analysis (HTA) and Cause and Effect. All of the mentioned
methods are supported by the scientific method introduced in the first learning pack.

Fault tree analysis

Fault tree analysis is the process of analysing a fault by using a decision tree. Decision trees can be
constructed in advance for common troubleshooting tasks, or they can be constructed ad hoc for new faults.
Generally, decision trees are based on prior knowledge of the expected behaviour of computer system
components. For instance, a user may perform a specific task, causing an expected result or outcome.
A technician would then analyse the outcome and determine whether the result is what was
expected or not. Either way, the technician will be able to consult a decision tree, which indicates
a suggested course of action. The following example is a simple decision tree that would help a
technician to troubleshoot a fault for a user who cannot access his/her e-mail. Take a minute to
consider this decision tree.

[Flowchart: decision tree starting at "User cannot access network (e-mail)". Decision nodes: "DHCP in
use?", "Success?" (after each ping), "E-mail accessible?" and "First time?". Test nodes: "Ping e-mail
server by hostname", "Ping by hostname in same subnet" and "Ping e-mail server by IP address". Action
nodes: "Reconfigure WINS/DNS", "Reboot and log on again", "Release/Renew DHCP", "Reconfigure IP settings
and reboot" and "Investigate possible hardware/cabling/driver faults etc." End points: "Problem solved"
and "Move on to another decision tree..."]

Figure: Example of decision tree


Decision trees are very helpful for first level troubleshooting. First level troubleshooting is usually done
by a help desk/support person with good knowledge of IT systems, but generally not regarded as an
expert. Decision trees are not helpful when faults are difficult and out of the ordinary – in this case an
expert may be engaged.
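
A decision tree can also be stored as data and walked by a short program. The Python sketch below is a deliberately simplified tree loosely based on the e-mail example. It is not a transcription of the figure, and the node wording, the constant names and the `walk` function are assumptions made for illustration.

```python
PING_NAME = "Does pinging the e-mail server by hostname succeed?"
PING_IP = "Does pinging the e-mail server by IP address succeed?"

# A decision node is (question, branch_if_yes, branch_if_no); a leaf is an action string.
TREE = (
    PING_NAME,
    "Move on to another decision tree",
    (
        PING_IP,
        "Reconfigure WINS/DNS",
        "Reconfigure IP settings and reboot",
    ),
)

def walk(tree, answers):
    """Follow the tree using a dict of {question: bool} and return the suggested action."""
    node = tree
    while not isinstance(node, str):
        question, if_yes, if_no = node
        node = if_yes if answers[question] else if_no
    return node

# The hostname ping fails but the IP ping works, so name resolution is suspect.
print(walk(TREE, {PING_NAME: False, PING_IP: True}))  # Reconfigure WINS/DNS
```

Keeping the tree as data means a help desk can extend it with new questions without changing the code that walks it.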

Hierarchical task analysis

Hierarchical Task Analysis (HTA) is another valuable skill that can be employed for fault-finding purposes.
HTA is a logical representation of a process and steps that must occur for this process to begin and
finish successfully. The following diagram is an example of an HTA for the boot-up sequence of a typical
computer system.

[Flowchart: "Begin boot up process" -> "System is powered up" -> "POST takes place" -> "Locate active
drive with MBR" -> "Execute bootloader" -> "Find OS and begin loading" -> "Finish loading OS and present
user interface" -> "Boot process completed". After each step, "Successful" leads to the next step;
otherwise the outcome is "Halt system".]

Figure: Sample Hierarchical Task Analysis (HTA) diagram


This HTA shows how a boot-up sequence is expected to happen. The HTA generally is very simple – it
only shows a series of small tasks in sequential order that make up the bigger task or process.

The following steps are included in this sample:
1. Begin the boot up process
2. System is powered up, if successful, continue to next step, otherwise halt system
3. POST – Power On Self-Test takes place, if successful, continue to next step, otherwise halt
system
4. Locate Active Drive (typically hard disk drive) and MBR (Master Boot Record), if successful,
continue to next step, otherwise halt system
5. Execute Bootloader (or bootstrap) program, if successful, continue to next step, otherwise
halt system
6. Find Operating System and begin loading, if successful, continue to next step, otherwise halt
system
7. Finish Loading OS and present User Interface, if successful, continue to next step, otherwise
halt system
8. Boot-up sequence completed
HTA can be very helpful and may be used in conjunction with other tools such as decision trees, during
the fault finding process. The great thing about HTA diagrams is that they are simple to construct.
Clearly, a good knowledge of the system is required in order to understand what steps need to be
taken to construct an HTA diagram. Due to their usefulness, HTA diagrams are not only used in IT, but
right across many fields of industry.
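
Because an HTA is an ordered list of steps that halts at the first failure, it maps naturally onto a simple loop. The Python sketch below uses the boot-up steps from the sample above. The function names and the simulated failure are hypothetical and exist only for illustration.

```python
# Steps taken from the sample HTA for the boot-up sequence.
BOOT_STEPS = [
    "System is powered up",
    "POST takes place",
    "Locate active drive with MBR",
    "Execute bootloader",
    "Find OS and begin loading",
    "Finish loading OS and present user interface",
]

def run_sequence(steps, step_succeeds):
    """Check each step in order and stop at the first one that fails."""
    for step in steps:
        if not step_succeeds(step):
            return f"Halt system at: {step}"
    return "Boot process completed"

def observed(step):
    """Hypothetical observation: everything works until the bootloader runs."""
    return step != "Execute bootloader"

print(run_sequence(BOOT_STEPS, observed))  # Halt system at: Execute bootloader
```

The step at which the sequence halts tells the technician where to start looking, which is how an HTA is typically used alongside a decision tree.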

Cause and effect diagrams

Cause and effect is another method that can be used by troubleshooting technicians. Cause and effect
is a method which allows a technician to analyse the possible causes of faults (the undesired negative
effects). The Cause and Effect method is usually implemented by using Cause and Effect diagrams.
What is a Cause and Effect diagram? It is a graphic tool that helps identify, sort, and display possible
causes of a problem or quality characteristic. These diagrams are sometimes known as fishbone
diagrams due to their shape.
What are the benefits of Cause and Effect diagrams?
 Helps determine root causes
 Encourages team participation
 Uses an orderly, easy-to-read format
 Indicates possible causes of variation
 Increases process knowledge
 Identifies areas for collecting data
The following sample is a general layout for Cause and Effect Diagrams.

[Diagram: CAUSE A and CAUSE B (above) and CAUSE C and CAUSE D (below) branch into a central spine that
leads to EFFECT.]

Figure: Cause and Effect "fishbone" diagram


All Cause and Effect diagrams begin with the Effect (in our case the undesired [negative] effect or
fault) being stated as the starting point. Through analysis and brainstorming, one can begin to add
possible causes of the resulting effect. In turn, each possible cause is analysed to work out the
underlying circumstances that might lead to it. For example, if the [negative] effect
is that a user lost a file kept on a disk drive, one possible cause could be that the disk drive’s file system
experienced corruption. A further question must then be asked: why did the file system become corrupt? You
will see that fishbone diagrams grow in complexity as each possible cause is further analysed. Have a
look at the following diagrams, where the [negative] effect is ‘computer downtime’, and note how each of
the potential causes is analysed to gain further insight into the problem.
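
The lost-file example from the paragraph above can also be written down as nested data, which makes the repeated "why?" questioning explicit. In the Python sketch below, only "File system corruption" comes from the text. Every other cause is a made-up placeholder, and the function name is an assumption for illustration.

```python
# Effect: "User lost a file kept on a disk drive".
# Each cause maps to the deeper causes found by asking "why?" again.
# All entries except "File system corruption" are hypothetical placeholders.
CAUSES = {
    "File system corruption": {
        "Failing hard disk": {},
        "Unexpected power loss": {},
    },
    "File deleted by mistake": {},
}

def root_causes(causes):
    """Return the causes that have no deeper cause recorded beneath them."""
    leaves = []
    for cause, deeper in causes.items():
        leaves.extend(root_causes(deeper) if deeper else [cause])
    return leaves

print(root_causes(CAUSES))
# ['Failing hard disk', 'Unexpected power loss', 'File deleted by mistake']
```

The leaves of the structure are the candidate root causes to investigate first. Each one can be expanded further as the analysis continues.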

Activity 7

This activity will require you to prepare an action plan for a given fault. The fault is described below.
You will need to formulate this plan in fairly generic terms since you would be working without
having had exposure to the system described.

The fault

You have been assigned to troubleshoot a network server. The server has been operational for over
18 months and has recently started to experience some problems. The symptoms described are as
follows:

 System hangs intermittently when accessing disk drives


 The Windows 2000 Event Log shows several entries relating to I/O and CRC errors
 The lights in front of the RAID enclosure sometimes blink continuously, even when disk
activity is nonexistent


You suspect that the RAID subsystem is failing.

How would you develop an action plan that will enable you to get to the bottom of this problem?

Rectify the problem

Formulate a solution or rectification

Once the source of a fault has been clearly identified, a solution must be formulated. Clearly, the
solution will normally depend on the nature of the fault. In general the following approaches can be
taken in order to reach a satisfactory outcome:
 Replace or fix the component, whether hardware or software, that is known to be the cause of the
fault
 Install software patches or hot fixes provided by the software manufacturer/developer
 Install or update ROM (firmware) where possible; older equipment might not support ‘flashing’.
Updates are normally provided by equipment manufacturers.
 Adjust the configuration, if misconfiguration is the cause of the fault
 Implement a workaround – Generally, a workaround is an acceptable solution when a fault
cannot be solved, or the solution is uneconomical, or would have an undesirable/unwanted
effect or negative impact on the system.

Implementing a solution

Depending on the fault and its cause, implementing a solution can be fairly straightforward. A
technician might be able to provide an instant solution to a trivial fault or to a fault that is regarded as
common. Experienced technicians are able to arrive quickly at a solution based on well-known
symptoms. For instance, if a computer system has been infected by a well-known virus, the technician
should be able to take remedial action on the spot by updating virus definitions, reconfiguring the
operating system, and deleting infected files.
When faults are more significant and complex (and possibly critical), planning is required. Sometimes
it is not possible or advisable that a technician attempts to fix a problem without planning and making
sure that the implications of remedial action are well understood. The following questions should be
answered:
1. Are replacement software and hardware components on hand?
2. Do software and hardware components need procuring? If yes, are funds available to procure
needed components?
3. Is a fix/patch available from manufacturer?
4. Is the impact of remedial action well understood? For example, is downtime acceptable, and is there a
potential loss of revenue?
5. Will impact to client be minimal?
6. Does the company have the skilled personnel that can fix the problem?
7. Should external help be sought?
8. Would a workaround be the preferred solution?
9. Has a rollback strategy been devised?
10. Are any training, education or procedural changes required?
The next step to be taken after applying a fault fix is to perform testing. In support circles, this process
is named ‘acceptance testing’. An ‘Acceptance Test’ can be defined as a formal test conducted to
determine whether or not a system satisfies its acceptance criteria and to enable the customer to
determine whether or not to accept the system. Acceptance tests may also be known as Functional
Tests. In other words, acceptance testing allows a technician to ascertain whether the fault has been
truly fixed, and that the client has recognised the fault as fixed.

Planning the rectification process

You have probably already learned about the fault finding process in general from previous learning
packs. You learned about the scientific method for fault finding (cyclic method) and the necessary
steps that need to be taken in order to rectify a fault.
The scientific method proposes to use logical and systematic steps (procedures), to analyse available
information, such as symptoms, in the hope of finding information that is useful and relevant whilst
discarding what is not. This procedure will enable you to draw conclusions and hopefully arrive at the
source of the problem. Generally, the method is repeated (cyclic), until the source of the problem has
been identified.
The principles of the scientific method are summarised in the following steps:
1. Gather Information
2. State the Problem
3. Form a hypothesis
4. Test the hypothesis
5. Draw conclusions
6. Repeat when necessary
This scientific method underpins cyclic fault finding. You might remember that in Learning Pack
‘Obtaining Fault Finding Tools’, we described cyclic fault finding as featuring eight steps:
1. Define Fault
2. Gather Details
3. Determine Probable Cause for Fault
4. Create an Action Plan
5. Implement Action Plan
6. Observe Result
7. Repeat if needed
8. Document
This learning pack deals with the last 5 steps of the cyclic fault finding method, particularly with
creating and implementing action plans. Ultimately, action plans are the instruments that enable us
to solve faults.
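
The cyclic nature of the method can be sketched as a loop that works through probable causes until the fault is resolved, keeping a log for the final "Document" step. This Python example is a simplified illustration. The function name, its parameters and the sample causes are assumptions and are not part of the method as described.

```python
def cyclic_fault_finding(fault, probable_causes, try_fix):
    """Work through probable causes in order until one fix resolves the fault.

    try_fix(cause) stands in for "create and implement an action plan, then
    observe the result" and must return True when the fault is resolved.
    Returns (resolved, log), where log is the record kept for documentation.
    """
    log = [f"Define fault: {fault}"]
    for cycle, cause in enumerate(probable_causes, start=1):
        log.append(f"Cycle {cycle}: probable cause = {cause}")
        if try_fix(cause):
            log.append(f"Resolved by addressing: {cause}")
            return True, log
        log.append("Fault persists; repeat with next probable cause")
    return False, log

# Hypothetical run: the first guess is wrong and the second one is right.
resolved, log = cyclic_fault_finding(
    "Workstation cannot reach the file server",
    ["Loose network cable", "Wrong IP settings"],
    lambda cause: cause == "Wrong IP settings",
)
print(resolved)
for entry in log:
    print(entry)
```

If the loop ends without success, the technician returns to gathering details and builds a new list of probable causes.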

Activity 8

Explore online knowledge bases

In this activity you will need to use the Internet to search for examples of knowledge bases. You will
need to visit vendor sites, where you might find examples of faults that have been collated for the
public to view. The idea behind knowledge bases is to provide common knowledge about problems
that customers encountered before, the cause for these problems and possible resolutions or
workarounds.

You might want to visit some of the following sites, or you may want to do your own search for
knowledge bases.

 www.support.microsoft.com/search/ Microsoft Corp online knowledge base


 www.novell.com Look for Novell’s online knowledge base
 www.sunsolve.sun.com/ Sun Microsystems online knowledge base
 www.hp.com and follow the links to ‘Support and Troubleshooting’ HP’s online knowledge
base.

Did you find any examples of faults in any of the above knowledge bases?


Developing an Action Plan


In terms of fault resolution and rectification, action plans are the summary of steps to be taken in
order to solve a fault. In relation to fault finding, action plans needn’t be complicated or lengthy.
Instead, action plans simply outline the steps that will be taken to try to solve the fault.
In many circumstances, the action plan is simply suggested by a fault finding tool
such as a decision tree. Decision trees are essentially aimed at helping the troubleshooter make
decisions and implement actions, depending on possible scenarios.
An action plan will generally have the following characteristics:
 Acknowledges the presence of a fault, providing justification for action to be taken
 Identifies the systems or components affected or impacted
 Identifies the objectives of the plan (i.e. restore optimum functionality)
 Identifies resources needed, including hardware, software, human resources and procedures
 Identifies severity and criticality, hence priority
 Identifies a timeframe for implementation, according to priority
 Identifies any support contracts that might exist and be applicable to the system in question
 Indicates the actual remedial steps to be taken. This might include system reconfiguration,
re-installation, software patches, component replacement and consultation with vendors, who can be
engaged as needed
 Indicates risks, including possible disruptions as a result of remedial action
 Identifies a workaround solution in case the previous steps failed to rectify the fault
The above example is a very comprehensive plan, with all the items that should be included in an
action plan. Keep in mind that in many cases, electronic change management systems will automate
many of these steps. Understandably, this is a good thing; otherwise technicians would spend a great
deal of their time formulating action plans.
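
To show how such a plan might be captured in a simple tool, the Python sketch below models some of the characteristics listed above as a data class and reports which items are still empty. The class, its field names and the sample fault are hypothetical. A real change management system will have its own structure.

```python
from dataclasses import dataclass, field

@dataclass
class ActionPlan:
    """A cut-down action plan; field names are illustrative assumptions."""
    fault: str
    affected_systems: list
    objective: str
    priority: str
    remedial_steps: list
    resources: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    workaround: str = ""
    rollback: str = ""

    def missing_items(self):
        """Return the names of the items that have been left empty."""
        return [name for name, value in vars(self).items() if not value]

# Hypothetical plan with the risk, workaround and rollback sections not yet written.
plan = ActionPlan(
    fault="Workstation cannot obtain an IP address",
    affected_systems=["Reception workstation"],
    objective="Restore network access",
    priority="Medium",
    remedial_steps=["Release/renew DHCP lease", "Check cabling"],
    resources=["Spare patch cable"],
)
print(plan.missing_items())  # ['risks', 'workaround', 'rollback']
```

A check like `missing_items` is one way to make sure a plan for a significant fault is complete before work starts.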
Action plans are particularly important for faults that have a significant impact on a business as a
whole. Trivial faults, such as those considered routine, do not warrant a formal action plan. Routine
fault finding generally is performed ad-hoc; that is, technicians are able to solve common problems
assisted by historical data, knowledge bases and well documented procedures, without having to
resort to special action plans.

Minimum disruption to clients

Regardless of the nature and severity (impact) of a fault, technicians will strive to resolve problems
with minimum disruption to clients. Sometimes, disruption cannot be helped, as the fault itself is
disruptive enough; however, the remedial steps should be such that disruption is kept to an absolute
minimum.
Some of the strategies that could be adopted are summarised below:
 Identify the extent and impact of the fault. You need to know what has been affected by the
fault itself and not by the steps you have taken. You must know whether you have made things
worse, or whether the symptoms are from a fault you inadvertently caused.
 When formulating an action plan, identify the most effective steps; that is, those actions that
would fix the fault and cause minimal disruption
 If systems are unusable, you might isolate them from the rest of the network for testing and
troubleshooting. This would avoid troubleshooting activity affecting working systems.
 Liaise with clients to find times that are convenient to them
 If you are dealing with critical infrastructure components such as servers and routers, perform
your testing and changes outside business hours
 If components were isolated from the network, be sure to test them fully in a lab/workshop before
reintegrating them. Systems with changed configurations might cause unexpected results and
new faults
 If running tests on a live network, understand the impact of these tests. Some tests can have
very negative effects on performance.
 Have a rollback or back-out plan

Planning for system rollback


Rolling back or backing out is fundamental to effective and efficient troubleshooting. Rollback and
back-out plans are the strategies that you might need to implement if things do not work out. If the
steps that you took as per your action plan weren’t effective and the fault is not resolved, you need
to take a step back, or roll back. You must be able to restore the system to its previous state. If you
are not able to roll back, the situation could, in fact, get worse.
If the modifications you introduced have not met the objectives stated by the action plan, then a
decision needs to be made about what, if anything, should be done. If the change is affecting users or
parts of the IT infrastructure adversely in new and different ways, a decision might be made to back
out the change and remove it from the production environment.
You also must consider some of the issues involved with rolling back a change:
 The amount of effort (time, resources etc.) required to perform the rollback
 The effect it might have on other (either planned or already deployed) changes.
 The possibility that users are already using the changed system, although not to the best
effect, and removing some functionality that the users have become accustomed to may be
worse than leaving it as is.
If you are in a position that you need to implement a rollback, and possibly implement some
emergency measures, you must think about the following questions:
 Has the problem been correctly analysed?
 Has the proposed remedy been adequately tested?
 Has the solution been correctly implemented?
When faced with a possible rollback, it might be better to provide a partial service in order to allow
the system to be thoroughly tested rather than to suspend the service temporarily, and then
implement the change.
Once the system has been successfully rolled back, you must return to the first step of the Cyclic Fault
Finding Method.
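
The core idea of a rollback (record the previous state, apply the change, verify, and restore if verification fails) can be sketched in Python as follows. A plain dictionary stands in for a system configuration here. The function and the sample settings are hypothetical, and real systems would rely on backups, images or restore points instead.

```python
import copy

def apply_with_rollback(config, change, verify):
    """Apply change(config) in place and keep it only if verify(config) passes.

    Returns True if the change was kept, or False if the configuration was
    restored to the snapshot taken before the change.
    """
    snapshot = copy.deepcopy(config)  # record the previous state first
    try:
        change(config)
        if verify(config):
            return True
    except Exception:
        pass  # an error while applying the change counts as a failed change
    config.clear()
    config.update(snapshot)  # roll back to the previous state
    return False

# Hypothetical change that fails verification, so it is rolled back.
settings = {"dns": "10.0.0.1"}
kept = apply_with_rollback(
    settings,
    change=lambda cfg: cfg.update(dns="10.0.0.2"),
    verify=lambda cfg: False,
)
print(kept, settings)  # False {'dns': '10.0.0.1'}
```

The snapshot is taken before anything is modified. That ordering is what makes it possible to return to the previous state.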

Activity 9

This activity will require you to devise a rollback strategy based on the scenario from the previous
activity. The fault is described below.

The fault

 You have been assigned to troubleshoot a network server. The server has been operational
for over 18 months and has recently started to experience some problems.
 The symptoms described are as follows:
 System hangs intermittently when accessing disk drives
 The Windows 2000 Event Log shows several entries relating to I/O and CRC errors
 The lights in front of the RAID enclosure sometimes blink continuously, even when disk
activity is nonexistent

You suspect that the RAID subsystem is failing.

How would you develop a rollback strategy for this situation?


Create a list of probable causes of the problem [10]


Creating a list of possible problems is a task of analysis.
Consideration of the following will assist the creation of the list:
Computer fails to start

The causes of computer failure to boot up can be broken down into four categories as follows:

 Bad electrical connection


 Power supply failure
 Operating system failure
 Hardware failure

Each of these categories covers a number of issues that computer repairers see on a very regular basis.
The most common issues are explained below.

In many cases, problems can be fixed and the computer is up and running again in a short amount of
time.

[10] Source: Online Learning for Sports Management, http://www.leoisaac.com/tec/main006.htm, accessed
11th December, 2016.

Bad Electrical Connection

This is a very common cause of computer failure in older computers. The constant heating and cooling
of the computer, atmospheric conditions and dust can all play a part. Most of the time the issue can
be solved in just a few minutes.

This category includes:

 bad connection between a memory module and the motherboard; or


 bad connection between the video card and the motherboard; or
 loose cable

Memory modules need to be reseated

It is a very common cause of boot failure that a memory module is not properly connecting with the
motherboard. If just one of the many pins on the module fails to connect in the motherboard slot, the
computer will not start. You will not hear the usual beep, nor will any text appear on the screen when
you try to boot. The computer will be lifeless when you turn it on, except perhaps for the case and CPU
fans. Take the power cord out of the back of your computer and open the case. Try taking out the
memory modules and then putting them back (with a bit of a shove). This is called reseating. Sometimes
you have to repeat the process a few times. The bad electrical contact occurs as a result of a gap
opening up, or a bit of corrosion or dust getting between the electrical contacts. Reseating works in
very many cases.

Video card needs to be reseated

This is much the same problem as with the memory modules, except that the video card needs to be
reseated. Of course, your computer may not have a separate video card. Furthermore, if you have a
laptop and it has a separate video card, then you can only get to the video card if you take the laptop
base apart. Laptops are really unfriendly in this matter. Unless you have experience, or you can find a
step by step guide to disassembling the laptop base, don't do it yourself.

Loose cable

Particularly if you have been moving a computer around, or you have opened the case for any reason,
it is perfectly possible that a cable is loose or there is not a good electrical connection. It is always
worth a try to recheck cables by removing and re-attaching each in turn. The cables to be mainly
concerned about are the ones that connect the hard drive to the motherboard and the cables from
the power supply unit that connect to the motherboard.

Power Supply Failure

Power supply unit is broken (desktop computers)

If your computer is completely dead when you turn it on (no fans, no lights, nothing), your power
supply may have failed. This is not uncommon. Replacing a power supply is easy and not that
expensive. If you are a novice, just disconnect all power supply cables, unscrew the unit and take it to
a computer supplier. This ensures that you get a new one with the connectors
you need and the right output rating in watts.

Laptop Battery

If you have a laptop, check that the battery is connected properly. Remove the battery and reattach
with a bit of a hard shove in case there is an electrical contact problem. If this does not work, it might
be that the battery has no charge. Either the battery is failing or the supply of electricity to the battery
is failing. Take a look at the socket where the lead from the battery charger goes.

If there is a thin pin in the socket, check whether it is very wobbly. It is not uncommon for someone to
trip on the battery charger lead and break the socket where the lead attaches. It is also possible
that the battery charger has died. You can purchase a universal laptop charger for around $60, and a
new battery does not cost that much if you order direct from China via eBay or AliExpress.

Operating System Failure

Generally, if a computer turns on, LEDs light up, the hard drive makes a few noises and some text
appears on the initial screen, then the probable reason for boot failure is the operating system.
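
The symptom descriptions in this section can be condensed into a rough first-guess triage. The Python sketch below encodes only the rules of thumb given in the text. The function name and its two yes/no inputs are assumptions for illustration, and a real diagnosis needs further checks, since hardware faults such as a failed video card can look similar.

```python
def triage_boot_failure(fans_or_lights, beep_or_text):
    """Return a first guess at the fault category from two observations.

    fans_or_lights: True if any fan spins or any light comes on at power-up.
    beep_or_text:   True if the usual beep is heard or text appears on screen.
    """
    if not fans_or_lights:
        return "Power supply failure"
    if not beep_or_text:
        return "Bad electrical connection (try reseating memory or video card)"
    return "Operating system failure"

print(triage_boot_failure(fans_or_lights=True, beep_or_text=False))
# Bad electrical connection (try reseating memory or video card)
```

The result is only a starting point for the list of probable causes, not a conclusion.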

Corrupt or missing file(s)

When the operating system fails to start and the hardware is not at fault, it will be because one or more
essential files that the operating system needs to start have become corrupt or gone missing. The two most
probable causes of this are:

1. The hard disk is old and beginning to fail and/or


2. An infestation of the computer with malicious software

If it is a matter of missing files or files that cannot be read, then running software such as SpinRite can
make a difference. SpinRite will often repair files and make them readable again. Following a scan and
repair by SpinRite, you may be lucky enough to get the operating system to boot up again. If so, take
immediate steps to back up the whole hard drive using good software such as Acronis True Image. This
software will create an "image" of your hard drive, which you must store on an external hard drive.
You can then purchase a new hard drive and transfer the image to it; however, you will need a hard drive
docking bay. A docking bay in effect turns an ordinary hard drive into an external drive, so the
image can be copied to the new drive before it is installed in a computer.

Many people faced with this situation resort to reinstalling the operating system but this is often not
necessary. A good computer fixer/repairer may well be able to repair the missing or corrupt file
problem. If the problem is repairable, and it usually is, you don't have to start all over again reinstalling
every bit of software you had.

Attack of malicious software

Malicious software is a generic name for viruses, Trojans, worms and rootkits, i.e. the nasties you need
to keep out of your computer. A word of warning: if it is the case that your operating system has been
damaged by malicious software, then often it is very difficult to make repairs and get the computer to
boot up again. It's not impossible, just difficult. Malicious software is so incredibly sophisticated these
days that even good proprietary Anti-Virus software cannot deal with it. If you scan a hard drive that
will not boot using a rescue disk provided by the makers of anti-virus software, it is likely that the
malicious software will be detected and repairs effected. However, the repairs may not be 100% and
the computer will still not boot. It is then necessary to find which system files in your computer have
been changed/corrupted by the malicious software and this can be very challenging to find and fix.

Hardware Failure

Video card failure

Not all computers have a separate video card. It is easy to tell if your computer does. If the monitor
lead connects to your computer in the same area where all the other leads plug in, then your computer
does not have a separate video card. On the other hand, if your monitor lead plugs into your computer
in an area well away from other leads, the chances are that you do have a separate video card.

[Figure: video card with fan; video card with cooling vanes]

The trouble with video cards is that they often only have a life span of about three years. This may
be because they get very hot, which is why they have a fan or cooling vanes. If the video card fails, your
computer may not start at all.

Fortunately, they are easy and inexpensive to replace. Most video cards cost around $40-80. However,
you could first try plugging the monitor into the other (unused) video port (see yellow arrow).

If this works, then you know for sure that the additional video card is broken. You can still use the
video component that is usually built into the motherboard, but the picture may be of lower quality.

If there is no other place you can plug in your video monitor lead, then you will need to replace the
video card.

Hard drive failure

Sometimes hard disk drives just break. If you hear clicking noises, then this is a likely scenario. This is
a bad scenario! Mechanical failure of the hard drive usually spells the end of all your data on the disk.
The only way that data can be salvaged involves taking the disk apart. This can only be done by
specialists and it will cost very large sums of money.

Motherboard Failure

Working out that a motherboard has failed is really a process of elimination. If it is not the video card,
power supply, memory modules, hard drive, etc., then the hardware failure is presumed to be in the
motherboard.

For laptops, a failure of the motherboard often spells the death of the computer. This is due to the
high cost of replacement. A new motherboard might only cost $100-150 if you buy overseas via eBay
or AliExpress, but it takes a good computer fixer about 1½ hours to completely disassemble the laptop
base, change the motherboard and reassemble the base. The labour cost could therefore comfortably add
another $100-$150. Still, if it is a good computer, many people may opt for this cost rather than the
cost of a new computer.

Why is my computer so slow?

The number one reason why computers get slower and slower with each passing year is that users
tend to install more and more software. Much of this software is unnecessary, provides little or no
benefit and simply clogs up the computer. Probably the most compelling reason why the user installs
the software is that it is FREE, but this is not a good reason at all.

There is one distinction that needs to be made before any further discussion of software takes place.
In order to keep your computer running at an optimum speed, you need to be concerned about the

software that automatically loads as soon as your computer starts. This type of software loads into
the background, unseen, and takes up valuable computer resources. If, on the other hand, the
software does not load at start-up but is accessible via your programs menu, then it is much less of an
issue.

Let's look at some of the typical software that causes the problem.

Toolbars

Everyone wants to give you a toolbar these days that purports to be useful. Toolbars are unnecessary
and largely not very helpful. They load into the background when your computer starts or when you
start up your Internet browser, and therefore take up system resources. You can uninstall toolbars
safely by going to "Add or Remove Programs" in your Control Panel. Typical unnecessary toolbars include
Alot, Inbox, and the most notorious, MyWeb Search. Ideally, it is best to have just one or two toolbars.

Software that comes with your Printer

When you purchase a printer you will generally receive a CD. When you load this into your CD-ROM
drive you will generally install a software suite that provides you with the capability to edit photos and
create a photo album. This all sounds really nice. The problem is that this software loads components
into the background of your operating system at start-up, just in case you happen to plug your camera
in, or use your scanner if you have one. Of course, most of the time this software just sits there unused
but still taking up valuable system resources. Really all you need when you buy a printer is to install
the printer driver, a small piece of software that enables your computer to communicate with the
printer and to be able to print.

Software that comes with your Mobile Phone

A phone is not just a phone anymore. It's a device for playing music, videos, sending and receiving
emails, instant messaging, taking pictures, and searching the web. The software that comes with your
phone enables you to attach your phone to your computer and store music on your phone, manage
your emails and download pictures you have taken. For some people this capability is enjoyable
and useful, but for others it may have limited or no value. If you hardly ever use this software and yet
it loads into the background of your computer on start-up, you may wish to uninstall it.

Instant Messaging

Microsoft Messenger and other similar instant messaging software has an annoying habit of loading
on start-up and, if configured, will automatically sign you on. This announces to the world that you
have just sat down at your computer and can now be messaged. However this is yet another piece of
software that just sits there in the background of your operating system, whether you are using it or
not. You can, of course, right click on the icon in the bottom right hand corner of your screen and exit
messenger.

Other software

Other software that typically loads into the background and provides benefit for only a fraction of the
time that you use a computer includes:

• Apple QuickTime
• Adobe Update Manager
• Audio Mixers and Managers
• Java
• PowerDVD

Malware, Spyware, Adware

All the above software is legitimate but there is also the very real possibility that your computer loads
other nasty software at start-up. Unfortunately, there are just too many websites that will infect your
computer with "Trojans". This is a form of software that can unleash really bad consequences for the
computer user. You should read what Wikipedia says about Trojans.

If this happens to you, you will really know it! Suddenly your computer behaves like you have no
control any more.

So what do I do?

Stopping legitimate software from loading at start-up can be accomplished by using the configuration
utility called MSCONFIG. If you have Windows XP, click on the START button, and then on "RUN". Type
MSCONFIG and press the ENTER key. When the configuration utility opens you will see that there are
seven (7) tabs - General, SYSTEM.INI, WIN.INI, BOOT.INI, Services, Startup, Tools. Click on the
Startup tab. You will then see a list of software that loads at start-up. A program can be stopped
from loading simply by unchecking it.

If you have bad stuff i.e. Trojans, Viruses and Worms, then the above solution won't help you. You will
need to download and run the "Windows Malicious Software Removal Tool" as a starting strategy. It
does not get rid of everything but it will take care of many of the common Trojans. You also need to
run a full scan of your computer with good Anti-Virus software and you may also need to download
and use additional software such as Sophos Anti-Rootkit.

At the end of the day this may be all too much or too scary for you. This is when you need to turn to
people with expertise. So if you are totally frustrated with your computer, don't throw it out, it can be
fixed.

Computer Freezes

What causes a computer to freeze?

It could be just one of those annoying things with computers that they very occasionally freeze. The
first thing you notice is that your mouse pointer will not move and then, to your disappointment, you
find that striking any key on the keyboard has no effect either.

Under these circumstances, all you can do is to shut down the computer by holding down the "on
button" on the computer for 5 or more seconds.

When your computer restarts you may come to a black screen with white writing and be asked the
question whether you want to start in "Safe Mode" or "Start the Computer Normally". Choose "Start
the Computer Normally". If the computer continues to freeze after it has started up, then there is a
problem. Otherwise, you may find the computer works perfectly for many months before any repeat
of the problem.

Hard Disk Errors

Your computer hard disk has a limited life span and this may range from 3-6 years depending on the
frequency and amount of usage. Be warned, however, that hard disks occasionally fail before reaching
3 years of age and so you should always back up your important data.

As the hard disk reaches the end of its life it may begin to experience issues such as failing to read and
write data accurately, overheating, and mechanical failure.

Your computer may freeze if your hard disk fails to read data correctly. Such a fault may be
intermittent i.e. sometimes the data is read correctly and sometimes it is not. If your computer needs
some data that is critical to the operating system, and it cannot read it, then your computer will likely
come to a grinding halt.

If your hard disk is getting old, the best thing you can do is to create an image of the hard drive and
copy this image to a new hard drive (before it is too late). You need special software such as Acronis
True Image to do this.

Computer Electrical Problems

All it takes for your computer to freeze is a momentary loss of electricity either to the whole computer
or to an individual component inside. If you are having shut downs and freezes, the first thing you
should check is whether the power lead to your computer is plugged in properly to the computer or
the power board. It is really easy to dislodge a power lead with your foot.

You should also suspect that the power board itself could be faulty.

Very commonly, with old computers, the issue is inside the computer. If there is a bad connection
between your memory modules and your motherboard, or between your video card and your
motherboard, then a momentary loss of electricity flow could occur. In either case, reseating the
memory modules or video card cures the problem in a large percentage of cases.

Computer Hardware Malfunction

A typical scenario in ageing computers is that the video card (if you have one) becomes faulty. Although
video cards have cooling fans or cooling vanes, they can become hot if the fan or vanes become
clogged with dust. This is one good reason to clean your computer yearly. Excessive heat is an enemy
of your computer and component failure can result.

The power supply unit is another component that can fail but usually, if it does, the problem is that
the computer won't start at all. Nevertheless, the power supply unit may be the cause of freezes.

It is also possible that your motherboard may have a faulty component. This is a hard one to be sure
about. If you keep getting freezes and you have eliminated everything else, then it may be your
motherboard that is the problem. If you have a desktop computer, an experienced computer repairer
can install a new motherboard in about 30 minutes if it comes to this. However, if you have a laptop,
the job may take 1½-3 hours. Furthermore, laptop motherboards are usually more expensive than
desktop motherboards.

How to Fix Malicious Software

What is malicious software?

Malicious software, or malware for short, is a term given to any kind of software that infiltrates
your computer and subverts your control.

For instance, malware may:

• Utilise email addresses that you have stored to send out spam emails without your knowledge
• Compromise your security systems and allow outsiders entry into your computer
• Steal information from you, e.g. passwords, and send this information to unauthorised people
• Disable your anti-virus software
• Masquerade as anti-virus software and attempt to extort money from you to fix problems that
  you don't have
• Monitor your use of the internet
• Display pop-up advertising on your computer

Basically malware is cybercrime, a field of crime in which victim and perpetrator can be separated by
thousands of kilometres.

It is a constant battle these days to avoid infections of malicious software. The majority of people have
insufficient knowledge or understanding of the problem. Generally, people are not aware of the
dangers and have little idea of the degree of sophistication of malicious software.

Most computer users will rely on purchasing and installing anti-virus software as a defence against
malware but that's as far as it goes. However, to combat the problem computer users must also ensure
that:

• anti-virus software is updated daily
• the anti-virus software is used to scan the computer for infections at least weekly
• the computer's operating system is updated with the latest security updates daily
• some judgement is used in visiting web sites that could be risky
• some judgement is used before opening attachments to email

What's the purpose of malware?

Generally, the purpose of malicious software is to make money. This may be achieved for the
perpetrator of the crime through the delivery of unsolicited pop-up advertising, stealing information
that could be used in fraudulent transactions, inviting you to pay money to fix the actual malware
infection or inviting you to part with money on false pretences.

Damage caused by malware

Malware can be incredibly sophisticated these days. It often employs stealth tactics to evade detection
and often has the ability to compromise anti-virus systems and render them useless. It can change
your computer settings so as to hide your personal files and folders or it may hijack your Internet
browser so that it takes you to websites designated by the malware maker.

Removing Malware

Certain types of malware can be extremely difficult to remove from your computer, even by experts.
Even when recognised brands of anti-malware software such as Norton, Kaspersky, McAfee and Trend
Micro report that the problems are fully resolved, you can never be fully sure. That's why in the
corporate world computers used by staff are just reimaged at the first signs of trouble as all data is
stored on network drives rather than the staff computer.

Furthermore, it is often the case that recognised brands of anti-malware will find and remove
components of malware but then tell you that the problem is not fully resolved. Basically it's a
battleground where criminals work around the clock to find new ways to bypass and subvert
computer security systems. (see Recommended Free Anti-Malware Software below)

Evaluating the Risk

Basically, it is all about risk. Even before you have an attack of malware, you must evaluate the risk.
The most significant risk is that criminals might access your passwords and login details to your
Internet Banking or highly sensitive personal and private information such as Credit Card details.
Identity theft is an awful crime for those affected.

The victim is left in a position of having to explain to various authorities how they have lost money, or
their identity has been used for fraudulent transactions. Usually, however, investigating authorities
can quickly determine how the crime was perpetrated and absolve the victim of any wrongdoing.

Other risks include the possibility that your computer is being accessed by criminals and used as a
shield for crimes other than identity theft. For instance, when a government suffers a 'denial of service
attack' as a result of being bombarded by thousands of computers simultaneously across the world,
one of these computers could be yours!

Generally, however, losing data such as years of photos, music and documents on your computer is
not the main risk. Even though a computer might suffer a completely disabling malware attack, it is
nearly always possible for a computer fixer to recover and save data. Criminals want to take control
of your computer but not to the extent that you are forced to take drastic action such as reinstalling
the operating system because this puts them out of business on your computer.

Risks have two major factors, probability and severity. On both these factors, you need to take action.
The probability of an attack of malware is high, and potential consequences (severity) of such an
occurrence can be, for some people, devastating.
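
One common way to combine the two factors is to rate each on a simple scale and multiply them. The sketch below is for illustration only: the 1-5 scales, the thresholds and the names `risk_score` and `risk_level` are assumptions, not part of any standard described in this guide.

```python
def risk_score(probability: int, severity: int) -> int:
    """Combine two ratings (assumed scale: 1 = lowest, 5 = highest) into one score."""
    for name, value in (("probability", probability), ("severity", severity)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    return probability * severity


def risk_level(score: int) -> str:
    """Map a score to a label. The thresholds are illustrative assumptions."""
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Example: a malware attack rated likely (4) with severe consequences (5).
    score = risk_score(4, 5)
    print(score, risk_level(score))
```

Lowering either rating, for example by installing good anti-virus software (probability) or by keeping backups (severity), lowers the overall score.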

You can, of course, lower your risk by installing the very best anti-virus / anti-malware software and
by avoiding certain well-known traps. This reduces the probability of the risk but does not remove it.
If you have significant sums of money in bank accounts, it is necessary to take other measures, outside
of your computer systems, to protect your money.

How does my computer get infected with viruses?

The degree of sophistication used in Cybercrime these days is both extremely interesting and alarming.
The average computer user really needs to take some interest in what is going on, and to take steps
to reduce the possibility of being a victim.

It is a costly business at every level. Governments and corporate entities pay large dollars to secure
online systems from Cybercrime, and occasionally security systems fail. For individuals, the cost can
range from paying $80-100 per year for anti-virus protection to several hundred dollars paid to
computer repairers to recover damaged computer operating systems and lost personal files.

Although the most worrisome risk is identity theft and the stealing of passwords for online banking,
fortunately individuals who fall victim to fraud, theft and other cybercrime are given a measure of
relief from banks and financial institutions. It never makes headline news, but when a case is proven
that an individual has had their bank account emptied as a result of Cybercrime, the financial
institution will normally replace the money and nothing more is said. It is bad for business to discuss
such incidents.

So how does it happen? Well, out there in cyberspace, there lurk small snippets of malicious software
in every corner of the Internet. Accidentally download this software to your computer without
sufficient protection, and you are DONE!

The next time you download pirated music, videos or software, could be the next time you get
infected. The file you try to download might seem to be that piece of music you have been looking for,
but the file could contain some extra 'code' that opens a backdoor to your computer. Once deposited
on your computer, it dials home and more malicious software is downloaded. The degree of
sophistication is such that it can lie on your hard disk dormant until a given date, or it uses quite
excellent stealth strategies to hide itself from your anti-virus software. Yes, that's right, having anti-
virus software installed does not give 100% protection, especially if it is FREE!

Using any file swapping (known as peer-to-peer networking) to download pirated movies and music is
a real risk - just don't do it! It is illegal anyway, as it transgresses copyright law.

Perhaps a method even more likely to get your computer infected is to visit Porn websites, or websites
set up to look like Porn sites but really exist to infect your computer. Of course, not too many people
are going to freely mention their computer became infected as a result of searching for Porn! If you
have teenagers in the house, you had better get some strong protection. It is almost guaranteed you
will get infected.

Email is still a route for Cybercrime, and a very good one at that. Sooner or later one of your friends
or family members will send you an email with a dangerous attachment. If you should try to open this
attachment, you will be infected. Often the subject of the email will be something funny to entice you
to open the attachment. Do not open attachments with any of the following extensions:

 .EXE
 .COM
 .BAT
 .PIF
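
The check on the extensions above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for anti-virus software; the function name `is_risky_attachment` and the sample file names are made up for the example, and the list only contains the four extensions named above.

```python
from pathlib import PurePath

# The four extensions named in the list above, compared in lower case.
RISKY_EXTENSIONS = {".exe", ".com", ".bat", ".pif"}


def is_risky_attachment(filename: str) -> bool:
    """Return True if the file name ends in one of the risky extensions."""
    # Only the last extension counts, so "photo.jpg.exe" is treated as .exe.
    return PurePath(filename.strip()).suffix.lower() in RISKY_EXTENSIONS


if __name__ == "__main__":
    for name in ("holiday.jpg", "funny_video.EXE", "photo.jpg.exe"):
        print(name, is_risky_attachment(name))
```

Note that the check looks only at the final extension, because a common trick is to name a file something like "photo.jpg.exe" so that it looks like a picture.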

Protection against viruses

All computer users need to learn the basic steps for protecting a computer against an attack of
malicious software (virus, Trojan, worm, rootkit).

What you need to learn is:

1. Ensure that your operating system is downloading and installing updates regularly. If a
computer falls behind with updates, you will most definitely be more susceptible to attack.
2. Install quality anti-virus and internet protection software. As a computer repairer, I see a lot
of computers brought to me for virus removal that have free anti-virus software installed.
3. Learn how to look at a website that comes up in a Google search and judge whether it is
likely to be safe. One customer could not understand why every time they searched for their
bank using Google, they kept going to a peculiar website and getting infected. Unfortunately
they did not read well enough the search results in Google. If they had done so, they would
have seen that the URL was nothing like their bank's URL.

4. Avoid installing free software just because a notice pops up on your computer when you
search the internet and begs you to install, purporting to offer so many benefits all completely
free. Do not get fooled; free software usually comes at a price sooner or later.
5. Be careful about lending your computer to other people, or letting them use your computer
without supervision, unless you feel sure they can be trusted not to do something that you
wouldn't like. When someone brings me a computer to remove malicious software, I often
ask them "Do you have teenagers in the house?"
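
The judgement described in point 3, checking that the address in a search result really belongs to your bank, can be sketched as follows. This is only an illustration: the function name `host_matches` and the domain `mybank.example` are hypothetical, and a matching address does not by itself prove that a site is safe.

```python
from urllib.parse import urlparse


def host_matches(url: str, expected_domain: str) -> bool:
    """Return True if the URL's host is the expected domain or a subdomain of it."""
    # urlparse only finds the host when the URL includes a scheme such as https://
    host = (urlparse(url).hostname or "").lower()
    expected = expected_domain.lower()
    return host == expected or host.endswith("." + expected)


if __name__ == "__main__":
    # "mybank.example" is a made-up domain used only for this example.
    print(host_matches("https://www.mybank.example/login", "mybank.example"))
    print(host_matches("https://mybank.example.secure-login.example/", "mybank.example"))
```

The second address starts with the bank's name but actually belongs to a different site, which is exactly the kind of URL the customer in point 3 failed to notice.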

What anti-virus software?

Well, I have been using AVG Internet Security (not the free version) for a while now, and so far my
computer is trouble free. I also recommend Norton Anti-virus; their reputation is well deserved.
There are other good brands such as Kaspersky Lab and perhaps McAfee.

Whatever anti-virus software you install, it is not much good if it does not update daily. You need to
make sure it does.

Warnings from the computer

Messages that your computer gives you on your screen should always be read and considered. I guess
we have all fallen victim to the computer message that says "Do you want to save your document?"
and, without a moment’s thought, we cancelled the message to find we just lost an hour’s work!

Occasionally your computer will give you unexpected messages in an effort to warn you that your
computer has a problem! These messages may be difficult to understand as they are not written in
plain language. They may also be quite lengthy and this inhibits people from reading them or writing
them down.

As a general rule any time an unexpected or unusual notice is displayed on your screen, you should
STOP, CONSIDER and WRITE DOWN the message. If you do this, and your computer is developing a
major issue, you are a long way towards solving the issue or at least providing really important
information to a computer fixer.

Here are some warnings that you might see and should definitely consider carefully:

• "System has recovered from a serious error"
• "Immediately back up your data and replace your hard disk drive. A failure may be imminent.
  Press F1 to continue"
• "You are running out of disk space on Local Disk [drive]."
• "This site may harm your computer"
• "This type of file could harm your computer"
• "The directory or file is corrupt and unreadable"

This list of important computer warnings is by no means complete. Furthermore, not all warnings on
your screen spell gloom and doom.

If you don’t know whether a computer warning is major or minor, then take the time to look it up on
the Internet. Search for the error message on the Internet by typing the message exactly as written
on your screen. If the error message is very long, type just the first 6-8 words exactly as written on
your screen.
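
As a rough sketch of this advice, the snippet below turns an error message into an exact-phrase search query using only the first few words. The function name `search_query` and the default of 8 words are choices made for this example; wrapping a phrase in double quotes is a common search engine convention for exact matching, but how each search engine treats it may vary.

```python
def search_query(message: str, max_words: int = 8) -> str:
    """Build an exact-phrase search query from the start of an error message."""
    # Drop any double quotes in the message so they don't break the phrase.
    words = message.replace('"', " ").split()
    phrase = " ".join(words[:max_words])
    # Surrounding double quotes ask the search engine to match the phrase exactly.
    return f'"{phrase}"'


if __name__ == "__main__":
    warning = ("Immediately back up your data and replace your hard disk drive. "
               "A failure may be imminent. Press F1 to continue")
    print(search_query(warning))
```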

Information about errors on the Internet

Many of the pages that you find on the Internet about the error will be "forums".
These are sites where people can create a login and share ideas. These sites are often used by people
either seeking help with computers or providing help with computers. It is often the case that you
have to sift through many useless comments, or pleas for help, before you find some golden
information. Some degree of patience is required to sift through not only one page, but several pages
on several sites. You won't always be lucky but the majority of the time you can glean enough
information to make your search worthwhile, even if you just get the gist of whether the error is minor
or serious.

Taking action on error messages

It should never be presumed that every time you press the 'on button' of your computer, it will boot
up without fail. Most computer users will have experienced that sinking feeling when, one day, the
computer does not boot into the operating system (Windows, Mac OS). If you get a warning from
your computer, it could be that the next time you try to start your computer, it will fail. It is therefore
necessary to (a) research error messages and (b) backup your files. You should be backing up regularly
but, in reality, most people do not. It's a combination of lack of skills, lack of awareness of risks and
laziness. If you regularly back up your files, the consequences of computer failure are never that bad.
You might be put to some expense and inconvenience but your irreplaceable libraries of family photos
and important documents are saved.

Another action you can take is to discuss your computer error message with your friendly neighbourhood
computer repairer. You need somebody who has good experience in fixing computers, not the whizz kid
who thinks he's good with computers.

The sort of advice that you might get is to replace your hard disk drive, before it’s too late. Some types
of error messages are caused by old and failing disk drives. For an experienced computer fixer, it is an
easy job to 'clone' your hard drive to a new hard drive but only while the operating system still works.
So the earlier you take action the better. The cost of replacing a hard disk before it becomes a major
problem will always be much less than the cost of replacing it after it has failed.

Hard Drive Failure

What is the hard drive?

The hard drive or hard disk drive is a large component in your computer that stores all your user
software and all the files you create. There are two predominant sizes: the 3½" size for desktop
computers and the 2½" size for notebook computers. For the average user, a new hard drive is
expensive at around 10% of the total cost of the computer.

For more information see also "How a hard drive works".

Why do hard drives fail?

Unfortunately hard disk drives have a limited life span. Somewhere between 3-5 years is the normal
life span of a hard drive, depending on how much work it does on a day to day basis. Environmental
conditions also play a significant part in determining the life duration of the hard drive.

Principal causes of failure of hard drives include:

• Heat (a build-up of excessive heat especially in notebook computers)
• Bumps and vibration (particularly in regard to notebook computers)
• Wear and tear / metal fatigue / old age of components
• Electrical storm / power surge

As a computer user you must take account of the fact that a hard drive has a limited life span. It is very
likely that one day your hard drive will stop working without warning. If this is due to mechanical or
electronic failure of the hard drive, you will lose all the information on it unless you have made a back-
up.

Signs of hard drive failure

If you are lucky, you may get a warning. You may get a message from the computer saying that "hard
disk failure is imminent". If this happens, then you must take immediate steps to back up your user
files and seek technical assistance.

Other signs of possible hard drive failure include:

• messages from the computer that files are corrupt or missing
• the computer goes into error checking shortly after start-up before booting into Windows
• several computer crashes over a period of days or weeks
• computer fails to boot up
• computer reports that it has "recovered from a serious error"

What to do if signs of hard drive failure occur?

The very best thing you can do if you think there are signs of hard disk failure is to create a full image
of your hard drive. There are two ways of doing this but you will likely need an experienced computer
fixer to carry out the following.

Method 1: You can install software such as Acronis True Image on the computer to create the full
image of the hard drive and then attach an external hard drive on which the image is stored.

Method 2: Take the hard drive out of the computer and attach it to another computer that has Acronis
True Image installed already by using a hard drive docking bay. Then use the image creation software
to create and store the image either on the second computer or an external hard drive attached to it.

You need to create this full image while your operating system still boots up. If you can do this, you
have survived. You can purchase a new hard drive to which the image can then be transferred.

If your computer no longer boots up but the hard drive appears to be still working, then repairs to the
operating system can be attempted before creating an image. If the repairs to the operating system are
successful then create the image as soon as you can.

If the hard drive appears to be working but repairs to the operating system fail, then the best strategy
is to copy the user files to a new location as quickly as possible before the hard drive stops working
completely. When you purchase a new hard drive you will need to start from scratch and reinstall the
operating system and all the software you previously had on the old hard disk. Then put back all your
user files which you saved from old hard drive.
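
Copying user files off a failing drive can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function name `rescue_copy` is made up, the folder paths in the usage example are placeholders, and the script simply skips any file the drive refuses to read so that one bad file does not stop the whole copy. It is not a substitute for imaging software.

```python
import shutil
from pathlib import Path


def rescue_copy(source: Path, destination: Path) -> list[Path]:
    """Copy every file under source into destination; return the files that failed."""
    failed = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        target = destination / path.relative_to(source)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also keeps file timestamps
        except OSError:
            # A failing disk may refuse to read some files; note them and move on.
            failed.append(path)
    return failed


if __name__ == "__main__":
    # Placeholder paths: the user's folder on the old drive and a folder on an external drive.
    problems = rescue_copy(Path("C:/Users/example"), Path("E:/rescued_files"))
    print(f"{len(problems)} file(s) could not be copied")
```

The destination should always be on a different physical drive from the one that is failing.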

If the hard drive appears to be no longer working i.e. it makes clicks or other funny noises, then
basically you have lost your data. While it is possible to take the disks (platters) out and put them in
another identical hard disk, this procedure requires a dustless environment, special tools, lots of
knowledge and a steady hand. In reality, this is not something that computer fixers do, and the cost
could be in the thousands of dollars if you can find the appropriate experts.

Activity 10

In this activity you will develop a system support log. You will be required to create a small
application using either a database package or a spreadsheet application.

• Create a simple system support log by developing a small database application. You may
  use a product such as Microsoft Access (any version) or Microsoft Excel (any version).
• The system support log should accept at least 7 data items, such as date, time, requestor,
  fault description.
• If using a database application you should develop a simple form for data entry.

What data items did you include in your system support log?


Test the system to ensure the problem has been solved and record results

Acceptance tests
An ‘Acceptance Test’ can be defined as a formal test conducted to determine whether or not a system
satisfies its acceptance criteria and to enable the customer to determine whether or not to accept the
system. Acceptance tests may also be known as Functional Tests.
Acceptance tests are common when commissioning new systems, system upgrades or when
significant changes and enhancements are implemented.
Acceptance testing in relation to fault finding and rectification is not as detailed and comprehensive
as when, for instance, implementing a new corporate application. However, acceptance testing is still
necessary in order to close fault finding cases. If an acceptance test fails, then the problem still exists
as it did before, or some functionality has not been fully restored.
In general, acceptance testing aims at:
• Designing and building an accurate test environment that models the conditions in production
• Performing user acceptance tests
• Performing controlled pilot testing in the production environment where necessary (if
applicable)
• Evaluating acceptance testing results to make a valid decision to move toward declaring a fault
resolved

The development of an Acceptance Test involves a number of iterative steps:
1. Assess the type of testing required
2. Develop the procedures and instructions for testing
3. Develop the necessary test scripts
4. Execute the test scripts
5. Report any defects
6. Retest any fixes
Note: test script refers to the series of steps to be taken during testing, and not to program scripts
written in languages such as Perl or JavaScript, although these could well form part of the testing process.
Acceptance testing addresses step 6 – ‘Observe Result’ (test) of the cyclic fault finding method. You
may formulate a test plan by carefully following the 6 steps outlined in Planning the Rectification
Process.

Acceptance Test Criteria

Acceptance test criteria refer to what things should be considered to determine fault resolution,
correct operation or expected functionality. In relation to fault finding, the criteria might be very
simple: the system works as expected! However, acceptance testing goes beyond the basics by
formalising the process and getting the user to acknowledge that the fault has been fully fixed.
Alternatively, the user acknowledges that satisfactory action has been taken to provide an optimum
alternative or workaround.
Criteria are often referred to as metrics. Metrics are statistical information or values that are used as
evidence for evaluating the performance of a system.
Imagine that a user could not access the network and logged a call with the company’s help desk.
Remedial action is taken and the following criteria used to ascertain acceptance:
1. User able to log in to network
2. User able to access e-mail
3. User able to access shared files and printers
4. User able to access Internet
5. User able to access corporate applications such as corporate database reporting systems
6. Performance when accessing network as expected
7. System logs do not report errors
8. Monitoring software does not indicate faults or network errors
9. All of the above are true all the time
Quite often, users will be satisfied as soon as they realise that functionality has been re-established.
Nonetheless, acceptance criteria are important so that faults are fully resolved the first time and do
not re-occur.
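
A list of acceptance criteria like the one above can be recorded as a simple checklist that is run the same way every time. The following Python sketch is an illustration only: the names `Criterion` and `run_acceptance_test` are hypothetical, and the checks are placeholders that would be replaced by real tests or by the user's confirmation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    description: str
    check: Callable[[], bool]  # returns True when the criterion is met


def run_acceptance_test(criteria: list[Criterion]) -> bool:
    """Run every check, print a pass/fail line for each, and return the overall result."""
    results = []
    for number, criterion in enumerate(criteria, start=1):
        passed = criterion.check()
        results.append(passed)
        print(f"{number}. {criterion.description}: {'PASS' if passed else 'FAIL'}")
    # The fault is only considered resolved when every criterion passes.
    return all(results)


# Placeholder checks: each lambda stands in for a real test or a user confirmation.
criteria = [
    Criterion("User able to log in to network", lambda: True),
    Criterion("User able to access e-mail", lambda: True),
    Criterion("System logs do not report errors", lambda: False),
]

fault_resolved = run_acceptance_test(criteria)
print("Close the case" if fault_resolved else "Fault not fully resolved: keep the case open")
```

Because one placeholder check fails, this run reports that the case should stay open. The design choice worth noting is that a single failed criterion is enough to keep the fault open, which matches the idea that faults should be fully resolved the first time.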

Help desk operations might go as far as developing acceptance criteria as standard procedures for
dealing with the rectification of routine (common) faults.
The implementation of acceptance testing will ultimately enhance the efficiency and effectiveness of
a support operation.

Activity 11

This activity will require you to devise an acceptance test procedure based on the scenario from the
previous activity. The fault is described below.

The fault

• You have been assigned to troubleshoot a network server. The server has been operational
for over 18 months and has recently started to experience some problems.

The symptoms described are as follows:

• System hangs intermittently when accessing disk drives
• The Windows 2000 Event Log shows several entries relating to I/O and CRC errors
• The lights in front of the RAID enclosure sometimes blink continuously, even when disk
activity is nonexistent

You suspect that the RAID subsystem is failing.

How would you develop an acceptance test procedure?

Identify and implement common preventative maintenance techniques to
support ongoing maintenance strategies¹¹
How to Maintain Computer Hardware

The first step is to carry out maintenance on the computer's hardware components. Typical measures
include the following.

Placing computers where there is good air circulation. This is necessary to prevent the hardware
from overheating.

Regularly cleaning dust from inside and around the computer case, where it settles on the hardware
components. Cleaning can be done with a brush and a vacuum cleaner or, better still, with air from a
compressor so that as much dust as possible is removed. Do this once every few months, depending on
the condition of each computer.

Using a voltage stabiliser so that the electricity supplied to the power supply remains stable.
Better still, use a UPS (Uninterruptible Power Supply), which keeps your computer from losing
electrical power suddenly.

For laptops specifically, the source author covers care in a separate article, "9 Tips on How
to Maintain Laptop".

How to Maintain Computer Software

The second, equally important task is to maintain the computer's software and operating
system. Several software maintenance actions are available:

Backing up your computer system.

Backing up your computer system is a preventive maintenance action that makes it easier to
recover the system when damage occurs. Backups can be made with software such as
Macrium Reflect Free Edition or EaseUS Todo Backup, or simply by running the System Restore
utility (Create Restore Point) before and after installing an application.

Installing an antivirus and running regular scans.

Install an antivirus program and run scans periodically. Avast and AVG Free Antivirus are examples
that the source author found powerful enough to protect a computer from virus infection. In addition,
use a malware scanner such as Malwarebytes or Norman Malware Cleaner to make sure your
computer is free from viruses and other malware.

¹¹ Source: Catatan Mahasiswa, http://untileted.blogspot.com.au/2012/12/cara-merawat-hardware-komputer.html,
as on 11th December, 2016.

Updating Windows, the antivirus and other software.

Whether automatically or manually, keep Windows, the antivirus program and other software such as
MS Office, the browser and Java up to date. Besides closing security holes, updating gives you
the latest features of each application.

Optimising the system.

To optimise the system, we can use built-in Windows tools such as the following:

o Disk Clean-up, for removing files that are no longer needed.
o Check Disk, to check and repair the file system on the hard disk.
o Disk Defragmenter, to optimise the disk by rearranging files that are
"scattered" across the hard drive.

In addition to the built-in Windows tools, third-party optimisation software such as
Advanced SystemCare or Glary Utilities can make maintenance and optimisation easier.

Cleaning Your Computer

Cleaning a computer can mean two things:

1. Physically cleaning the computer, i.e. removing dust
2. Removing rubbish files and infections from your hard drive.

Cleaning Desktops of Dust

Dust clogs the air vents and restricts air flow. The temperature inside the computer
may rise and the computer may overheat, damaging components and causing software freezes and
dropouts.

If you have a desktop computer, you should take a look inside once a year and, if dust has built up,
remove it with a vacuum cleaner (carefully, of course). You should particularly inspect the fan that
sits on top of the CPU for dust build-up (see picture below). You might have already noticed that the
CPU fan seems to run faster and louder on hotter days than it did before. This could be because the
dust in the cooling vanes is restricting air flow and the fan is not as effective in keeping your CPU
cool. There are three ways to clean the dust off the CPU:

1. You can detach the fan from the cooling vanes that sit on top of the CPU, use an old
toothbrush or some other small brush to gently brush away the dust, and then remove it with
a vacuum cleaner.
2. You can purchase a can of compressed air and blow the dust out. This is a more expensive
solution as compressed air in a can is not cheap.
3. You can leave the fan on top of the CPU cooling vanes, but pick at the dust clogging it with a
toothpick, a very small screwdriver or some other small sharp tool. Then, as you dislodge
the clumps of dust, just vacuum them away. This "operation" is done through the blades of the
fan. You can remove most of the dust using this method, which is quite quick and costs
nothing.

The picture to the right shows the build-up of dust in the heat sink, which sits on top of the CPU.
If this heat sink clogs badly with dust, the CPU may overheat and be damaged.

Cleaning Laptops of Dust

If you have a laptop, it is not a good idea to disassemble it in order to remove the dust. Instead,
try to dislodge the dust from around the fan by using a can of compressed air, which usually costs
around $10. This is fairly expensive, but much cheaper than a dead computer.

Cleaning the rubbish on your computer

Over time, your computer will accumulate a significant amount of "rubbish" that serves no purpose
and possibly affects the performance of your computer. This rubbish includes:

• Programs (software) accidentally installed or no longer required
• Malicious software, e.g. Trojans, viruses, worms, rootkits
• Temporary files downloaded from the internet
• The Recycle Bin (files you have deleted)

Most computer users only take action when something goes wrong. However, preventative
maintenance is the better approach. Here is a quick overview of what you should do.

Programs, Unwanted or Accidentally Installed

All over the internet there are web pages touting software that purports to clean your computer
of rubbish, e.g. registry errors, spyware, viruses, etc. Computer users are persuaded or fooled into
installing such software, and it is not uncommon for a computer to have 3 or more such programs.

It is better to avoid installing any such software and instead to have one reputable system such as
Norton 360. Other types of unwanted or useless software include games, uninstallers, web search,
toolbars for your browser, and more.

Generally most of this software can be easily uninstalled.

You will need to go into the Control Panel, find "Programs and Features" (Windows 7) and
then look down the list of software. Be careful to identify the right software to uninstall. If you
right-click on the software, you will see "Uninstall". Select this and follow the prompts.

Cleaning the Recycle Bin and Temporary Internet Files

Disk Clean-up is a standard Windows utility that helps you free up space on your hard disk. Disk
Clean-up removes temporary Internet files, removes installed components and programs that you no
longer use, and empties the Recycle Bin.
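
Before running a clean-up, it can be useful to see how much space temporary files are actually taking. The Python sketch below is an illustration, not part of Disk Clean-up: the function name `folder_size_bytes` is hypothetical, and it only measures the current user's temporary folder (it does not cover temporary Internet files or the Recycle Bin, and it deletes nothing).

```python
import os
import tempfile


def folder_size_bytes(folder: str) -> int:
    """Add up the sizes of all files under folder, ignoring files that cannot be read."""
    total = 0
    for root, _dirs, files in os.walk(folder):  # os.walk skips unreadable folders by default
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # the file vanished or access was denied
    return total


temp_folder = tempfile.gettempdir()
size_mb = folder_size_bytes(temp_folder) / (1024 * 1024)
print(f"{temp_folder} currently holds about {size_mb:.1f} MB of temporary files")
```

Recording this figure before and after a clean-up gives a simple measurement for the maintenance log.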

Go to the Windows "Start" button in the bottom left corner, then click as follows:

Step 1

The Disk Clean-up utility will take several minutes to work out which files can be deleted.
Sometimes Disk Clean-up may take a long time. Do not be concerned; just let it finish.

Step 2

When Disk Clean-up has finished working out which files it can delete, it will report back to you as in
the following illustration. Generally you should simply accept the advice given, i.e. you don't need to
change which boxes are ticked.

Step 3

When you click on OK, Disk Clean-up will ask "are you sure..."

You should click on Yes

Step 4

Disk Clean-up will then begin the cleaning process, and this may take a few minutes.

Document the signs and symptoms of the problem and its solution, and load
to database of problems or solutions for future reference

What is a system support log?


A system support log refers to paper-based or electronic documentation which is maintained by help
desk and support personnel and technicians that might have responsibility for the fault finding
process.
The reasons for maintaining and developing such logs are varied and very important to any successful
support and help desk operation. These reasons can be summarised as follows:
• Provide historical data: Support logs give a manager access to historical data, which can be
analysed for trends and used to help develop preventative measures.
• Identify preventative measures: Support logs provide an IT manager with data that can help
identify faults and how to prevent these. If faults can be prevented, the efficiency and
effectiveness of an IT operation will be greatly enhanced.
• Build knowledge bases/support databases: These data/knowledge bases allow support
personnel to have access to historical data and possibly construct procedures for dealing with
these faults in the future. Technicians may use these records as future reference.
• Simplify the troubleshooting process: If someone else encountered the same or a similar
problem before, technicians will benefit by having rapid access to the solution or the
strategies which may lead to the solution.
• Provide Key Performance Indicators (KPIs): Many support operations are outsourced;
hence, historical data will help develop contracts and Service Level Agreements (SLAs). If SLAs
are in place, these KPIs may be used to assess and review performance against criteria within
SLAs.
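
To make the ideas of trend analysis and KPIs more concrete, the short Python sketch below counts recurring fault types and calculates an average resolution time. All of the data is invented for illustration, and the 4-hour target is an assumed figure, not one taken from any real SLA.

```python
from collections import Counter

# Hypothetical closed log entries: (fault type, hours taken to resolve).
closed_faults = [
    ("Printer", 1.5),
    ("Network", 4.0),
    ("Printer", 0.5),
    ("Hard disk", 6.0),
    ("Network", 2.0),
    ("Printer", 1.0),
]

# Trend: which fault types recur most often?
counts = Counter(fault_type for fault_type, _ in closed_faults)
for fault_type, count in counts.most_common():
    print(f"{fault_type}: {count} faults")

# KPI: average time to resolve, which could be compared with an SLA target.
average_hours = sum(hours for _, hours in closed_faults) / len(closed_faults)
sla_target_hours = 4.0  # assumed target, for illustration only
print(f"Average resolution time: {average_hours:.1f} h (target {sla_target_hours} h)")
```

In this made-up data, printer faults recur most often, which is the kind of pattern that would prompt a preventative measure such as scheduled printer servicing.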

Activity 12

In this activity you will practise maintaining a system support log. You will use a sample MS
Access Database to enter fault data. You will also review the applications and make
suggestions for improvements.

• Download this example of a simple database application (.mdb, 230KB) for reference:
http://lrrpublic.cli.det.nsw.edu.au/lrrSecure/Sites/Web/6196/lo/2324/documents/2324_supportlog_dbase_co.mdb
A Microsoft Access 2000 database file is provided.
NOTE: your computer security system may interfere with the download of this file. Save the
file to your hard drive, run your virus software over it and then open the file in MS Access.
• Practise adding a few records to the database
• Review all of the data fields that allow you to enter information

Make recommendations as to what improvements could be made to the database system
to improve its functionality and usefulness.

What should it include?

Generally, system support logs will vary from organisation to organisation. Depending on the
requirements, organisations will develop system support logs, by following internally developed
standards or by following industry standard guidelines.
Nowadays system support logs are kept electronically in databases. Some organisations might still
have requirements for some amount of paper-based documentation and logs, but increasingly,
support teams are implementing electronic systems.
Electronic system support logs can be internally developed, by in-house software/application
developers, or they can be purchased off-the-shelf as a readymade solution for support/help desk
operation.
In-house electronic system support logs: Many organisations develop their own systems for system
support logs. The reasons for this may include:
• Lower costs: Internally developed system support logs might work out cheaper than
purchasing expensive commercial solutions.
• In-house software development expertise: If a business is lucky enough to have software and
application developers in its team, they might be engaged in developing an in-house
support solution.
• Better caters for business needs: In-house developed systems might better cater for the needs
of a business, particularly if there are specific technical requirements (e.g. compatibility issues).
Commercial system support solutions: Generally, these products are purpose-built to assist support
and help desk operations. Commonly, commercial products' features include:
• Specialist software: Purpose-built for the task
• Flexibility: Suitable for more situations, one solution fits all
• Support: Producers provide ongoing support and continued development of the product
• Implements best practice: Generally, products follow industry trends and guidelines of what
constitutes best-of-breed practice
• Support for standards: Software is fully compatible with most platforms
• Adherence to quality standards and guidelines: Many commercial products already
implement functions which are compliant with standards such as ISO 9000 and ITIL
• Can be expensive: Many commercial products might have prohibitive costs for SMEs.
In terms of what type of information is actually maintained in system support logs, it is difficult to
predict, given differing requirements from business to business. However, generally, the following
items are common in most support logs and incident management systems for tracking incidents and
faults.
• Issue #
• Initiator (who logged the call)
• Initiator Extension or Phone #
• Date/Time Opened
• Summary Description
• Impact/Importance
• Type (of fault)
• Owner (of system)
• Current Status (open, in process, closed)
• Next Step
• Next Step Date
• Completion Date
• Resolution, development request # or link to vendor support request
The above list is not exhaustive and does not include all possible items that the business might need
to include. Again, businesses will assess their needs and choose what information is required and build
and maintain logs accordingly.
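
As an illustration of how the common items listed above might be stored electronically, the following Python sketch creates a small support log table using the `sqlite3` module from the standard library. The table name, column names and sample call are all hypothetical; a real business would choose fields to suit its own needs.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE support_log (
        issue_no        INTEGER PRIMARY KEY AUTOINCREMENT,
        initiator       TEXT NOT NULL,
        initiator_phone TEXT,
        opened_at       TEXT NOT NULL,
        summary         TEXT NOT NULL,
        impact          TEXT,
        fault_type      TEXT,
        owner           TEXT,
        status          TEXT NOT NULL DEFAULT 'open',  -- open, in process, closed
        next_step       TEXT,
        next_step_date  TEXT,
        completed_at    TEXT,
        resolution      TEXT
    )
    """
)

# Log a hypothetical call; the database issues the next issue number.
cursor = conn.execute(
    "INSERT INTO support_log"
    " (initiator, initiator_phone, opened_at, summary, impact, fault_type, owner)"
    " VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("J. Citizen", "x1234", "2016-12-11 09:30", "Cannot log in to network",
     "High", "Network", "IT Operations"),
)
issue_no = cursor.lastrowid

# Close the issue once the fault is resolved and accepted by the user.
conn.execute(
    "UPDATE support_log SET status = 'closed', completed_at = ?, resolution = ?"
    " WHERE issue_no = ?",
    ("2016-12-11 11:00", "Replaced faulty patch lead", issue_no),
)

open_count = conn.execute(
    "SELECT COUNT(*) FROM support_log WHERE status <> 'closed'"
).fetchone()[0]
print(f"Issue {issue_no} closed; {open_count} issues still open")
```

Letting the database generate the issue number keeps each reference unique, which is what allows later calls about the same fault to be matched to the right record.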

Examples of support logs


System support logs can be developed in-house, or purchased as an off-the-shelf commercial solution.
Businesses that choose to develop their system support log solution in-house can take several
approaches:
An electronic database system – in this case, databases can be large systems running on dedicated
database servers, with special front-end applications to interface with the database; web-based
interfaces, where technicians use an Internet browser to use the system; or simpler database
solutions using a product such as MS Access or FileMaker Pro. The following is an example of a simple
system support log developed in Microsoft Access:

Figure: example of a simple system support log


A purpose-built electronic application – these are developed as fully fledged systems, using an
object-oriented development environment such as Java, Visual Basic .NET or C++.

Simple spreadsheet applications – Small businesses particularly resort to less sophisticated solutions
such as customised spreadsheet applications using a product such as MS Excel. An example is shown
below:

Figure: Simple spreadsheet application


Commercial applications are varied and examples are abundant. A simple search on an Internet search
engine on Help Desk software will return thousands of hits with reference to dozens of applications.
With system support logs being critical in the support role, all help desk software applications will
provide some form of logging, or fault tracking. A common term to define fault-tracking is ‘ticketing’.
Whenever someone logs a call with the help desk, an electronic ‘ticket’ is issued, becoming a
reference number. Any subsequent calls will need to refer to this ticket number/incident number.
Among the available help desk products is the BMC Remedy IT Service Management software suite
(www.bmc.com/remedy). Remedy provides a whole range of software applications that address
help desk support, change management, knowledge management and all aspects of the IT lifecycle.
Go to the BMC website above and look under "Product Families". Choose the "IT Service Management"
link and take a look at the product sheets for screenshots and details.
The company Altiris also supplies enterprise-grade help desk software. The Altiris software offers
similar features to Remedy and is likewise a large enterprise-grade solution. For more information about
Altiris products, visit their website at www.altiris.com.
A smaller application that is becoming well known is Intuit’s Track-It! (http://www.itsolutions.intuit.com/).
You were already introduced to Track-It! in an earlier Learning Pack. Track-It! offers full
help-desk capabilities and call/incident logging. Go to the Track-It! website
(http://www.itsolutions.intuit.com/) and select "Tour Track-It!" for an excellent short overview of this
software. You may also download an evaluation version of Track-It! Note that free registration is
required.

Organisational standards and requirements

There are several approaches to determining what standards are required. The following sections
explore the options on which businesses may base their decisions.

In-house developed standard

In this case, the business works out in-house what the standard should be in terms of
documentation and maintaining system support logs. The benefit of this approach is that a business
may achieve maximum flexibility, developing a system which is carefully tailored to the needs of the
business. On the other hand, if a proper development process does not take place, businesses can
fall into the trap of keeping inadequate records, and slack record maintenance renders support
ineffective.

International and national standards


National standards bodies (such as Standards Australia – www.standards.com.au) and international bodies
(such as the International Organization for Standardization – ISO, www.iso.org) have developed and
promulgated standards for the keeping of system documentation, which cover support logs. The
actual standard that is applicable to the development and management of systems documentation is
AS 3876—1991 (based on ISO 6592-1985). Clearly, businesses that choose to maintain support logs
and follow these standards will benefit by implementing best practice, maximising effectiveness and
efficiency.

Industry de-facto standards


De facto standards are standards that haven’t been ratified or endorsed by national or international
bodies, but have been accepted by industry as valuable. De facto standards do not become standards
overnight; they gain recognition as numerous businesses choose to implement them. For
documentation and managing the IT lifecycle, one UK-based framework has become prominent
around the world: the Information Technology Infrastructure Library – ITIL (www.itil.co.uk), which
originated within the UK government and public service. Many vendors such as Microsoft embrace
ITIL, and even help desk software developers include support for ITIL guidelines – one such product
is Remedy.
It is important to stress the criticality of implementing system support logs that follow a well-
developed and sound standard, which will underpin the effectiveness and efficiency of a help desk
operation.

Maintaining support logs

System support log maintenance is generally the responsibility of technicians, engineers, help desk
operators and anyone involved in the fault finding process. The way in which the log is maintained will
fundamentally depend on the system being used.
Manual and paper-based systems – personnel are fully responsible for filling out and completing
paper-based documentation. Generally, businesses that use paper-based or manual systems will have
developed standard forms which are widely available to all involved in the fault finding process.

Due to the nature of paper based systems, careful records must be maintained to make the system
an efficient one. Inadequate filing and keeping of paper based support logs will result in an inefficient
system. Consequently technicians might have difficulty finding historical data, which might be relevant
to current faults. Clearly, electronic systems are simpler to maintain and provide a wider range of
functions, such as data analysis, fast retrieval of data, trending, reporting etc.
Electronic systems – As with paper based systems, the maintenance of logs is the responsibility of
technicians, engineers, help desk operators and anyone involved in the fault finding process.
Electronic systems, whether developed in-house or commercially developed, are far more efficient
and easier to maintain. Electronic systems are software based and generally widely available to
anyone who requires access to it. The methods for maintaining and updating these logs will depend
directly on the system, and what sorts of interfaces are available. Generally, the following methods
might be used to update and maintain system support logs:
• Simple systems developed in-house, such as customised spreadsheet applications, are
generally updated directly. The data entry is performed directly on the spreadsheet.
Sometimes an interface might be developed to streamline the data entry process, and perhaps
to provide additional functions such as reports, graphics and charts. These spreadsheet
applications are generally only available on a small number of computers and are generally only
suited to small operations.
• Small database systems, such as those developed using products such as Microsoft Access and
FileMaker Pro, might be developed as applications. For instance, people who develop Access
databases generally provide an interface in the form of a switchboard, with links to data entry
forms, reports, charts etc. These systems are generally only available on a small number of
computers and are generally only suited to small operations.
• Medium and large database systems, such as in-house or commercially developed products,
generally provide more sophisticated interfaces for data entry. These are widely available to
network users throughout an entire enterprise over a LAN and possibly a WAN. These
products might also make use of Internet technologies, such as web servers and browsers.
Generally, the database system is regarded as a ‘back end’ system, set up on a dedicated
database network server. The client accesses the system via purpose-built interfaces
developed as client applications. Increasingly, help desk support systems rely on web browser
technology to provide a ‘web-like’ interface which users can access anywhere without the
need for a special interface or client application. Remedy is an example of a product that
allows the user to interact with the system via a web browser.
Regardless of the system being used, maintenance of system support logs is critical to the
effectiveness and efficiency of a support operation. Ultimately, it is up to support personnel to update
the records and ensure that all data entry is performed accurately and in a timely manner. Without
disciplined and accurate maintenance, even the most sophisticated electronic systems will not be effective.

Activity 13

Internet Activity — Commercial Applications

In this activity you will conduct Internet searches to find commercial products that can be used for
implementing and maintaining system support logs. You will then collate your findings, so that their
features may be compared.

Using your favourite search engine, search the Internet for Help Desk support type applications.
Consider the following criteria for selecting suitable products:

Features

• Call tracking
• Asset Inventory
• Knowledge base support
• Reporting and trending
• Data may be output as graphics and charts
• Licensing costs

Use the criteria above to compare help desk support applications.

Information Technology
ICTSAS426 Locate and troubleshoot ICT equipment, system and software faults

Choose the most appropriate fault finding method
Analyse the problem to be solved
Identify a solution and rectify the problem
Test system and complete documentation

INTERPERSONAL SKILLS
• Interpersonal skills are also known as people skills
• Proper utilization requires the use of active listening, tone of voice, and the interactions between individuals
• There are two groups of people you utilize these skills with
  • Customers
  • Co-workers

CUSTOMER INTERPERSONAL SKILLS
• Greet your customer to instill a favorable first impression
• Help unload the computer if needed
• Ask about the problem
• Allow the customer to explain fully without interruption

DIAGNOSE THE PROBLEM
• Listen for clues from the customer to ascertain if the problem is a quick fix or is more involved.
• If it is a quick fix, resolve the issue at that time
• If the problem is not, proceed to ask more questions about the symptoms
• Give the customer a timeframe as to when the repair should be completed

QUESTIONS TO ASK THE CUSTOMER – USER ERROR
• How is the machine setup at home?
• Are you familiar with the software being used?
• How long has the problem been occurring?
• Is there any information you can offer about why the problem began?
• What troubleshooting steps have you attempted already?
• How can you duplicate the issue?

TROUBLESHOOTING – SOFTWARE/HARDWARE
• Once setup and user error are eliminated, the problem is narrowed down to hardware or software
• If the problem is software related the repair often can be completed in a shorter period of time
• Repair costs should be lower
• Hardware related repairs often require more time to complete
• Hardware repairs also tend to cost more
• Diagnose whether it’s hardware, software, or both

FILLING OUT A WORK ORDER
• Obtain the customer’s important contact information
  • Name
  • Phone
  • Address

FILLING OUT A WORK ORDER
• Computer information
  • Make
  • Model
  • Serial #
• Date problem arose
• Initial diagnosis

FILLING OUT A WORK ORDER
• Explain to the customer the repair process and cost associated with it
• Repair process
  • Short term fix / Long term fix
• Costs associated with the repair
  • Diagnosis fee
  • Labor/time
  • Parts

FILLING OUT A WORK ORDER
• Obtain customer’s consent
• Customer signs work order
• Inform customer you will contact them upon verification of the problem with final costs to complete the repair
• Inquire if the customer would like the old parts back
• Verify the best means of contact and time of day

UTILIZING COMPUTER MANUALS AND BASIC RESOURCES
• Begin by logging time of work on the work order
• Identify the problem by using your own basic knowledge
• Consult with co-workers if you are unable to identify the source of the problem
• Utilize the manufacturers’ owners manual or online database for frequently known issues with a particular model
• Utilize online resources (forums/discussion boards)

COURSE OF ACTION
• Map out a plan
• Identify what needs to be done
• Collect the proper tools
  • Software suite
  • Hardware replacement parts
  • Tools for removal and replacement of parts
• Complete repair

CHECKING OF A REPAIR
• Reconnect and power up the computer
• Attempt to duplicate the problem and verify that the repair did resolve the problem
• Restore any hardware/software settings if necessary post-repair
  • Screen resolution
  • User preferences
  • Previously installed files

PERFORM COMPLIMENTARY COMPUTER PERFORMANCE EVALUATION
• Clear temporary files from the computer
• Update Operating System software
• Investigate to see that the computer’s files and Operating System are secure
  • Anti-virus is installed
  • Anti-virus is still active
  • Malware is not being reported in the computer’s activity log
• Perform a quick evaluation to see if the computer has any other potential hardware issues

FOLLOW-UP CUSTOMER CONTACT
• Notify the customer that the computer is ready for pick up
• Explain the problem to the customer
• If the computer showed any other issues, inquire as to how the customer would like to proceed
• Notify them of the final cost of the repair
• Discuss any possible upgrades resulting from the complimentary evaluation
  • Install an active Anti-virus software
  • Larger hard drive
  • More computer memory for system performance
• Verify time for customer to pick up their computer

CLOSING OUT A WORK ORDER
• Conclude a work order by
  • Writing up a formal diagnosis of the problem
  • Steps taken to resolve the problem
  • List of hardware/software that was installed or replaced
  • Time to complete repair
  • Documentation of any calls made to the customer
  • Itemized price breakdown of the repair

CONCLUDING THE TRANSACTION
• Greet the customer as they enter to pick up their computer
• Walk through the work order with the customer
  • Recap the diagnosis and what was done to resolve the problem
  • Explain the price breakdown of the repairs
  • Review complimentary computer evaluation if applicable
• Ensure the customer is satisfied with the work
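
The work order fields described in the slides above can be pictured as a simple record. The following is a minimal Python sketch only, not a format prescribed by this unit: the class name `WorkOrder`, its field names and the sample customer details are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, List


@dataclass
class WorkOrder:
    """Hypothetical work order record; field names are illustrative only."""
    # Customer contact information
    customer_name: str
    phone: str
    address: str
    # Computer information
    make: str
    model: str
    serial_number: str
    initial_diagnosis: str
    # Completed as the repair proceeds
    customer_signed: bool = False
    steps_taken: List[str] = field(default_factory=list)
    parts_replaced: List[str] = field(default_factory=list)
    charges: Dict[str, Decimal] = field(default_factory=dict)

    def add_charge(self, label: str, amount: str) -> None:
        # Decimal avoids floating-point rounding in the price breakdown
        self.charges[label] = Decimal(amount)

    def total(self) -> Decimal:
        # Sum of the itemized price breakdown
        return sum(self.charges.values(), Decimal("0"))

    def ready_to_close(self) -> bool:
        # Needs consent, documented steps and at least one itemized charge
        return self.customer_signed and bool(self.steps_taken) and bool(self.charges)


# Sample data below is made up for illustration
order = WorkOrder("Sally", "555-0100", "1 Example Street",
                  "ExampleBrand", "X100", "SN-0001",
                  "Runs slowly; antivirus subscription lapsed")
order.customer_signed = True
order.steps_taken.append("Removed temporary files and renewed antivirus")
order.add_charge("Diagnosis fee", "40.00")
order.add_charge("Labor/time", "60.00")
print(order.total())  # prints 100.00
```

Keeping the charges as separate labelled entries means the itemized price breakdown can be read back to the customer line by line at pick-up.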


INDEPENDENT / GROUP WORK
• You may work independently or in pairs
• Make sure to include
  • Customer Greeting
  • Diagnosing the problem
  • How to fill out the initial work order
  • Steps in repairing the computer
    • You will only need to document the steps taken in finding a solution, not the actual solution
  • Post repair steps
  • Customer pick up procedure

INDEPENDENT / GROUP WORK
• Pick one of the following two scenarios and explain completing a work order from beginning to end
• Scenario One
  • Sally walks into the shop and explains that her PC is running slowly. The computer has an antivirus that came
    with it and has since lapsed. The customer’s granddaughter gave the computer to her and she doesn’t
    understand how to fix it.
• Scenario Two
  • Jon brings his laptop in because the computer doesn’t last more than an hour on its battery before it turns off.

TROUBLESHOOTING METHODOLOGY

Online video follows (51 minutes):


THE POWER-ON SELF-TEST, AND ERROR CODES

If something is wrong with your computer, an error code will be displayed on the top of your screen. For
example, any error between 201 and 299 means that there is a problem with your RAM memory; any error
between 601 and 699 means there is a problem with your floppy disk drive and/or floppy disk controller.
You will be prompted to press the "F1" key to continue booting the computer; normally, you will want to
power-down the PC and repair the problem before continuing to use it.

THE POWER-ON SELF-TEST, AND ERROR CODES CONT..

On newer computers, these error codes may be replaced with English-language error messages, such as
"Keyboard error or no keyboard present". However, many manufacturers still use error code messages to
report hardware problems. This forces the consumer to haul the computer into the repair shop, because
they don't understand the meaning of the error codes. On the course DVD, there is an exhaustive list of
IBM-compatible error codes and what each code means. For the A+ examination, you simply need to
memorize the major error code categories shown in the list above.
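
As a memory aid, the two error code ranges given above can be written as a small lookup. This is a sketch only: the function name `describe_post_code` is a made-up helper, and the table contains just the two ranges stated in this section. Other ranges exist and should be taken from the manufacturer's documentation.

```python
# Ranges taken from the text above; the table is deliberately incomplete.
POST_CODE_RANGES = [
    (201, 299, "RAM (memory) problem"),
    (601, 699, "Floppy disk drive and/or floppy disk controller problem"),
]


def describe_post_code(code: int) -> str:
    """Return the error category for a numeric POST code (hypothetical helper)."""
    for low, high, meaning in POST_CODE_RANGES:
        if low <= code <= high:
            return meaning
    return "Unknown code: consult the manufacturer's documentation"


print(describe_post_code(201))
print(describe_post_code(650))
```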
WHAT HAPPENS IF THE COMPUTER JUST BEEPS AT YOU?

You may also hear a series of beeps when you turn on the computer, IF SOMETHING IS WRONG. Normally,
you hear only one short beep. The one short beep (or two short beeps if you have a Compaq computer),
indicates that the POST has completed, and it found no hardware errors with the tested components. If
there are hardware problems AND the PC cannot display an error code or message to the screen, the
computer will beep in a predefined series of beeps to indicate exactly what is wrong with your PC. This
beeping is not random, and it can instruct you about exactly what is wrong with your PC.

WHAT HAPPENS IF THE COMPUTER JUST BEEPS AT YOU? CONT..

Here are some of what the most common DOS Audio Error codes mean:
• No display, no beeps: No power
• Continuous beep: Power supply failure
• Repeating short beeps: Power supply failure
• Two beeps: Unspecified problem; read message on screen for further details (such as keyboard error,
  drive misconfiguration)
• One long and two short beeps (or three short beeps, or eight short beeps): Display adapter (video card)
  failure
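
The beep patterns listed above can be kept as a lookup table for quick reference at the bench. The sketch below is illustrative only: the dictionary keys and the function name `diagnose_beeps` are assumptions, and it contains only the patterns given in this section. Beep codes differ between BIOS vendors, so the vendor's own table takes priority.

```python
# Patterns and meanings copied from the list above; keys are lower-case.
BEEP_CODES = {
    "no display, no beeps": "No power",
    "continuous beep": "Power supply failure",
    "repeating short beeps": "Power supply failure",
    "one short beep": "POST completed with no hardware errors found",
    "two beeps": "Unspecified problem; read the message on screen",
    "one long and two short beeps": "Display adapter (video card) failure",
    "three short beeps": "Display adapter (video card) failure",
    "eight short beeps": "Display adapter (video card) failure",
}


def diagnose_beeps(pattern: str) -> str:
    """Look up a beep pattern description (hypothetical helper)."""
    key = pattern.strip().lower()
    return BEEP_CODES.get(key, "Pattern not listed; check the BIOS vendor's beep code table")


print(diagnose_beeps("Continuous beep"))
```

Note that the text treats two short beeps as normal on a Compaq computer, so the "two beeps" entry needs to be read with the make of the machine in mind.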

STEPS IN THE BOOT PROCESS CONT..

2) As POST checks your computer, it looks to a record of data stored in CMOS RAM that tells what kinds of
components are in your PC. Specifically, it records what type of video card, floppy drives, hard disk,
memory and so forth are contained in your PC. POST will test your computer based on what it believes is
in your PC. If the information is missing or incorrect, the PC will not recognize or use certain
components in your system. It is important to keep a record of what specifically is inside your computer,
and of what is written into CMOS RAM.

3) If POST finds that there is a problem with your PC, it will display an error message or an error code
that tells specifically what is wrong with the unit. If it cannot display such a message, it will beep in a
specific pattern that indicates exactly what is wrong. If everything is OK with the computer, POST will
sound one beep to the system speaker, indicating that all of the tests passed normally with no errors.

4) The ROM BIOS will then look to the boot sector of either a floppy disk or a hard disk to find the boot
loader program of your operating system. If it cannot find this file in that location, the PC will give an
error message to the screen. When it does find the file, it loads the file into RAM, and then your
operating system takes charge of the computer. NTLDR is the boot loader program for Windows NT through
Windows XP; Windows Vista and later versions, including Windows 7, use the Windows Boot Manager
(BOOTMGR) instead. Linux commonly uses a program called GRUB to begin its boot process.

6) Then, WINLOGON.EXE and EXPLORER.EXE are loaded; these programs provide the user interface common to
Windows, and also allow you as a user to log onto the system. If any of these steps do not occur in a
normal manner, your PC may not boot up as you would expect. Knowing the steps in the boot process will
help you when trouble-shooting or analyzing problems with your PC.

EQUIPMENT MAINTENANCE

Maintenance refers to the actions necessary for retaining or restoring a piece of equipment, machine, or
system to the specified operable condition to achieve its maximum useful life.
There are two major types of maintenance performed on computers:
Preventive - Maintenance performed on equipment not necessarily when it is faulty but to make sure
breakdowns are minimized. It involves changes made to a system to reduce the chance of future system
failure.
Corrective - Diagnosing and fixing errors, possibly ones found by users.

We know change is inevitable and our systems need to keep pace with the changes. Some reasons are:
• Political decisions (e.g. introduction of a new tax).
• Hardware related changes.
• Operating system upgrades over time.
• Competition - new features to be added.

THINGS TO NOTE BEFORE EQUIPMENT MAINTENANCE

How to plan your IT maintenance
Checklist
• Do your IT maintenance regularly.
• Don’t ignore IT maintenance.
• Draw up a schedule for your IT maintenance.
• Software and hardware need maintenance.
• Automate as much as possible.
• Keep your documentation updated.
• Make monitoring part of your IT maintenance.
• Check and maintain your security.
• Don’t overcomplicate IT maintenance.
• Consider outsourcing IT maintenance.

HARDWARE MAINTENANCE

The purpose of hardware maintenance is to check the condition of cables, components, and peripherals;
clean components to reduce the likelihood of overheating; and repair or replace any components that show
signs of damage or excessive wear.
Tasks that are performed during a hardware maintenance program:
• Remove dust from fan intakes.
• Remove dust from the power supply.
• Remove dust from components inside the computer.
• Clean the mouse and keyboard.
• Check and secure loose cables.

SOFTWARE MAINTENANCE

The purpose of software maintenance is to verify that installed software is current.
Use the tasks listed as a guide to create a software maintenance schedule that fits the needs of your
computer equipment:
• Review security updates.
• Review software updates.
• Review driver updates.
• Update virus definitions.
• Scan for viruses and spyware.
• Remove unwanted programs.
• Scan hard drive for errors.
• Defragment hard drive.

BENEFITS OF PREVENTIVE MAINTENANCE

The following are the benefits of preventive maintenance:
• Increases data protection
• Extends the life of the components
• Increases equipment stability
• Reduces repair costs
• Reduces the number of equipment failures
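
One way to turn the software maintenance tasks listed above into a schedule is to give each task an interval and check which tasks are due. The Python sketch below is an illustration under stated assumptions: the task names come from the list, but the intervals in days and the function name `tasks_due` are made up for the example and are not recommendations from this unit.

```python
from datetime import date, timedelta

# Task names from the list above; intervals (days) are illustrative assumptions.
TASK_INTERVALS = {
    "Update virus definitions": 1,
    "Scan for viruses and spyware": 7,
    "Review security updates": 7,
    "Review software updates": 30,
    "Review driver updates": 30,
    "Scan hard drive for errors": 30,
    "Defragment hard drive": 30,
    "Remove unwanted programs": 90,
}


def tasks_due(last_done: dict, today: date) -> list:
    """Return the tasks never done, or last done at least one interval ago."""
    due = []
    for task, interval in TASK_INTERVALS.items():
        last = last_done.get(task)
        if last is None or today - last >= timedelta(days=interval):
            due.append(task)
    return sorted(due)


# Example: every task was last completed yesterday
today = date(2024, 1, 31)
history = {task: date(2024, 1, 30) for task in TASK_INTERVALS}
print(tasks_due(history, today))  # prints ['Update virus definitions']
```

Recording the date each task was last completed also gives you the maintenance records that the later tips in this section ask you to keep.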
A SAMPLE MAINTENANCE CHECK LIST

CREATE A MAINTENANCE SCHEDULE

MAINTENANCE BUDGET

• Is it present?
• Is the ICT officer involved in creating the budget?
• What does the budget cover? (spare parts, external consultancy, equipment replacement, tool kit)

MAINTENANCE TOOLKIT

What are the contents of a maintenance tool kit?
• Screwdrivers
• Hammer
• Crimping tool
• Pliers
• Digital multimeter
• Blower

TROUBLE SHOOTING

• What do you understand by the term trouble shooting?
• What common problems can be handled at office level by all employees?
• What technical gaps do you have within your organization?
• What problems can the ICT officer handle at organization level?
• What nature of problem necessitates engaging an external ICT professional?

IMPORTANT TIPS

• Always have a proper maintenance agreement/contract with target groups
• Always have a proper maintenance check list
• Always have a capacity building time table
• Always carry out a skills assessment of staff to guide on the type of training to organize
• Always use different training manuals for different trainings
• Always document processes on maintenance and keep records of all ICT equipment
• As a technology focal person, always build the capacity in new technologies (adaptation)
• ICT focal persons should always research relevant and affordable technologies and advise management on
  these technologies.

As a unit discuss ICT sustainability in the organization
Any Questions?
Student Assessment Information
The process you will be following is known as competency-based assessment. This means that
evidence of your current skills and knowledge will be measured against national and international
standards of best practice, not against the learning you have undertaken either recently or in the
past. (How well can you do the job?)

Some of the assessment will be concerned with how you apply the skills and knowledge in your
workplace, and some in the training room.

The assessment tasks utilized in this training have been designed to enable you to demonstrate the
required skills and knowledge and produce the critical evidence required so you can successfully
demonstrate competency at the required standard.

What happens if your result is ‘Not Yet Competent’ for one or more assessment tasks?

The assessment process is designed to answer the question “has the participant satisfactorily
demonstrated competence yet?” If the answer is “Not yet”, then we work with you to see how we can
get there.
In the case that one or more of your assessments has been marked ‘NYC’, your Trainer will provide
you with the necessary feedback and guidance, in order for you to resubmit/redo your assessment
task(s).
What if you disagree on the assessment outcome?

You can appeal against a decision made in regards to an assessment of your competency. An appeal
should only be made if you have been assessed as ‘Not Yet Competent’ against specific competency
standards and you feel you have sufficient grounds to believe that you are entitled to be assessed as
competent.
You must be able to adequately demonstrate that you have the skills and experience to be able to
meet the requirements of the unit you are appealing against the assessment of.
You can request a form to make an appeal and submit it to your Trainer, the Course Coordinator, or
an Administration Officer. The RTO will examine the appeal and you will be advised of the outcome
within 14 days. Any additional information you wish to provide may be attached to the form.
What if I believe I am already competent before training?

If you believe you already have the knowledge and skills to be able to demonstrate competence in this
unit, speak with your Trainer, as you may be able to apply for Recognition of Prior Learning (RPL).
Credit Transfer
Credit transfer is recognition for study you have already completed. To receive Credit Transfer, you
must be enrolled in the relevant program. Credit Transfer can be granted if you provide the RTO with
certified copies of your qualifications, a Statement of Attainment or a Statement of Results along with
Credit Transfer Application Form. (For further information please visit Credit Transfer Policy)

LEARNING OUTCOMES
The following critical aspects must be assessed as part of this unit:

1. Interact with customers, collect the necessary information and match customers' needs to company
products or service
2. Sell products and services including matching customers' requirements to company products and
services and finalise and record the sale

LEARNING ACTIVITIES

Class will involve a range of lecture based training, activities, written task, case study and questioning.

STUDENT FEEDBACK

We welcome your feedback as one way to keep improving this unit. Later this semester, you will be
encouraged to give unit feedback through completing the Quality of Teaching and Learning Survey

LEARNING RESOURCES
Other Learning Resources available to students include:

• Candidate Resource & Assessment: ICTSAS426 - Locate and troubleshoot ICT equipment, system and
  software faults
• Presentation handout
• PPT Presentation

TEXTBOOKS

You do not have to purchase the following textbooks but you may like to refer to them:

Unit Code(s): ICTSAS426
Unit Title: Locate and troubleshoot ICT equipment, system and software faults
Reference Book / Trainer & Learner Resource:
• Network Security Essentials: Applications and Standards, 4th Edition; William Stallings
• A Developers Guide to Network Security; Richard Conway, Julian Cordingley
• Firewalls and Network Security; Michael E. Whitman, Herbert Mattord, Richard Austin, Greg Holden
• Network Security: The Complete Reference 1st Edition; Roberta Bragg, Keith Strassberg, Mark
  Rhodes-Ousley
• Network Security Bible 2nd Edition; Eric Cole
• Cryptography and Quantum Computing: Securing Business Information; Bradley Tice


ASSESSMENT DETAILS

Assessment Summary
The assessment for this unit consists of the following items.

Knowledge Assessment

Task 1 – Troubleshooting ICT Faults

Formative Activities
In addition to the two assessment tasks, students will be required to complete activities as outlined
by their trainer/assessor – these will be taken from class resources, Enhance Your Future Learner
Guides.

Referencing Style
Students should use the referencing style outlined by the Trainer when preparing assignments. More
information can be sought from your Course Trainer.

Guidelines for Submission


1. An Assignment Cover Sheet (or cover page) must be attached to the front of every assignment to
confirm that it is your own work.

2. All assignments must be submitted within the specified timeframe (please refer to Due Date).

Assignment Marking
Students should allow 14 days’ turnaround for written assignments.

Plagiarism Monitoring
Students should use the referencing style outlined by the Trainer when preparing assignments. More
information can be sought from your Trainer.

Marking Guide
C Competent: for students who have achieved all of the learning outcomes specified for
that unit/module to the specified standard.

NYC Not Yet Competent: for students who are required to re-enrol in a unit/ module in their
endeavour to achieve competence

S Satisfactory: has achieved all the work requirements

NS Not Satisfactory: has not achieved all the work requirements

Every student at Danford College can expect to have “timely fair and constructive assessment of
work.” Assessment tasks must be marked in such a way that the result reflects how well a student
achieved the learning outcomes and in accordance with the assessment criteria. In addition to the
result, returned assignments must be accompanied by feedback that clearly explains how the
marking result/s was derived (summative), as well as how the student can improve (formative).

Refer to observation checklist below and/or consult your trainer/assessor for marking criteria for
this unit.

STUDENTS’ RIGHTS AND RESPONSIBILITIES


It is the responsibility of every student to be aware of all relevant legislation, policies and procedures
relating to their rights and responsibilities as a student. These include:
• The Student Code of Conduct
• The College’s policy and statements on plagiarism
• Copyright principles and responsibilities
• The College’s policies on appropriate use of software and computer facilities
• Students’ responsibility to attend, update personal details and enrolment
• Course Progress Policy and Attendance
• Deadlines, appeals, and grievance resolution
• Student feedback
• Electronic communication with students
• Other policies and procedures.

International Students: Please also refer to the ESOS framework for further details:
https://internationaleducation.gov.au/Regulatory-Information/Education-Services-for-Overseas-Students-ESOS-Legislative-Framework/ESOS-Act

ADDITIONAL INFORMATION

Contacts:
If you have a query relating to administrative matters such as obtaining assessment results, please
contact your Course co-ordinator.

Deferrals/Suspensions/Cancellations
Danford College will only allow deferrals/student requested suspensions under exceptional
compassionate circumstances. Once a student has commenced studies, students are not allowed to
take leave unless there are compelling and compassionate reasons. Please refer to the College’s
Deferment, Suspension and Cancellation Policy available in the Student Handbook and at Student
Administration. This policy has been explained to you at Orientation.

Course Progress Policy
You are expected to attend all classes and complete your units of study satisfactorily, within your term.
Your Course Trainer will make a report to the Course co-ordinator if there are any concerns about your
progress. The Course Progress Policy is available to you in the Student Handbook and at Student
Administration or on college website www.danford.edu.au.

Assessment Conditions

Gather evidence to demonstrate consistent performance in conditions that are safe and replicate
the workplace. Noise levels, production flow, interruptions and time variances must be typical of
those experienced in the systems administration and support field of work, and include access to
special purpose tools, equipment, materials and industry software packages including:

• system to be diagnosed
• diagnostic and fault finding tools
• technical and system documentation
• organisational requirements for documenting solution.

Assessors must satisfy SRTO2015/AQF assessor requirements.

Lesson/Session Plan
For face-to-face classroom based delivery as per time table.

Delivery Day 1
Delivery Topics:
• Introduction to ICTSAS426 Locate and troubleshoot ICT equipment, system and software faults and
  Assessment Requirements Overview (Page 3)
• Develop a troubleshooting process to help resolve problems (Page 6)
• Analyse and document the system that requires troubleshooting (Page 34)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 1 (Page 31)
• Activity 2 (Page 39)
• Commence Written Questions
• PowerPoint Presentation – Slides 1 - 22

Delivery Day 2
Delivery Topics:
• Identify available fault finding tools and determine the most appropriate for the identified problem
  (Page 42)
• Obtain the required fault finding tools (Page 50)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 3 (Page 49)

Delivery Day 3
Delivery Topics:
• Identify legislation, health and safety requirements, codes, regulations and standards related to the
  problem area (Page 56)
• Collect data relevant to the system (Page 63)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 4 (Page 75)
• Commence Task 1 – Troubleshooting ICT Faults

Delivery Day 4
Delivery Topics:
• Analyse the data to determine if there is a problem and the nature of the problem (Page 77)
• Determine specific symptoms of hardware, operating system and printer problems (Page 84)
• Formulate a solution and make provision for rollback (Page 103)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 5 (Page 100)
• PowerPoint Presentation – Slides 23 - 30

Delivery Day 5
Delivery Topics:
• Systematically test variables until the problem is isolated (Page 115)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 6 (Page 135)
• Activity 7 (Page 146)

Delivery Day 6
Delivery Topics:
• Rectify the problem (Page 148)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 8 (Page 150)
• Activity 9 (Page 153)

Delivery Day 7
Delivery Topics:
• Create a list of probable causes of the problem (Page 155)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 10 (Page 170)

Delivery Day 8
Delivery Topics:
• Test the system to ensure the problem has been solved and record results (Page 172)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 11 (Page 174)

Delivery Day 9
Delivery Topics:
• Identify and implement common preventative maintenance techniques to support ongoing maintenance
  strategies (Page 176)
• Document the signs and symptoms of the problem and its solution, and load to database of problems or
  solutions for future reference (Page 181)
Activities to be undertaken:
• Work through corresponding sections of Learner Materials and Assessment Tasks
• Activity 12 (Page 182)
• Activity 13 (Page 188)
• Complete Written Questions
• PowerPoint Presentation – Slides 31 - 43

Delivery Day 10
Delivery Topics:
• ASSESSMENT
Activities to be undertaken:
• Complete Task 1 – Troubleshooting ICT Faults

Knowledge Assessment (Written Tasks)

1. A customer tells you that his or her iMac has stopped working. The first thing to do is:
(a) Run Apple Service Diagnostic.
(b) Try quick fixes.
(c) Run MacTest Pro tests.
(d) Gather more information.

2. The hard disk does not appear on the desktop of a PowerBook G4. You cannot resolve the
situation over the phone, so the customer brings the system to you for repair. What items would be
useful for repairing the issue? (Choose all that apply.)
(a) Replacement keyboard and mouse
(b) Apple Service Diagnostic
(c) System software CDs
(d) Replacement module for the hard disk
(e) Tools

3. Which of the following is not an example of a split-half search process?


(a) Check for software issues before replacing any hardware.
(b) Remove external devices and internal cards, and test the computer by itself.
(c) If a module is easy to replace, swap it right away.
(d) Inspect components visually.

4. A customer's Power Mac G4 running Mac OS X 10.2.6 does not turn on. What is the first step to
take?
(a) Run Apple Service Diagnostic.
(b) Refer to Service Source.
(c) Check the power source and cable connections.
(d) Reset the PRAM.

5. What is the first step to take when a computer with a CRT display starts up to a black screen?
(a) Run Apple Hardware Test.
(b) Adjust the brightness and contrast controls.
(c) Rebuild the desktop.
(d) Reset the PRAM.

6. What are the five components of a minimal system for a Power Mac G4 (Quicksilver)?

7. When you first start up your minimal system, you do not get any sound. What component should
you check first?

8. If your minimal system is starting up correctly, what component do you add first?

9. You get no start-up sound from the system after swapping the main logic board. What
components are likely at fault?

10. Why is it important to check cables?

11. What are the two ways hardware can fail?


(a) Electronically
(b) Statically
(c) Physically
(d) Sonically

12. When determining if the problem is related to hardware or software, which of the
following is one of the most important questions to ask?
(a) Is the computer plugged in
(b) What operating system are you using
(c) Have you recently installed any new hardware or software
(d) Does the monitor work

13. If the mouse pointer moves intermittently on the screen but has not failed altogether,
which of the following troubleshooting steps should you take?
(a) Check its connection to the port
(b) Clean the mouse
(c) Reinstall the drivers

(d) Check for a conflict with the modem

14. Which of the following error messages might you receive when there is a problem with
RAM? Choose all that apply.
(a) Memory address error
(b) Memory test fails
(c) CMOS error
(d) Memory parity error at xxx

15. Which of the following could be used to fix a Windows start-up problem? Choose all that
apply.
(a) Use Last Known Good Configuration
(b) Use Safe Mode
(c) Use CMOS to change settings to default
(d) Use a bootable disk from an earlier version of your operating system

16. A user complains that his PC crashes during Windows XP start-up. Which of the following
would you suggest for him to correct the problem?
(a) Change his CMOS settings
(b) Enable VGA mode from the Windows Advanced Options menu
(c) Use Last Known Good Configuration
(d) Use the Check Disk utility

17. Which of the following could help you fix intermittent PC problems? Choose all that
apply.
(a) Use System Restore
(b) Perform a clean boot
(c) Use checksum utility
(d) Use undo recent changes option

18. Sally complains that her computer has changed the drive letters and she does not
understand why. What should you suspect, based on what Sally has told you?
(a) The hard drive is about to crash
(b) The operating system is corrupt
(c) She has recently added or removed a storage device
(d) Her system has a virus

Task 1 – Troubleshooting ICT Faults
You will be presented with a PC with no fewer than three faults introduced by your Assessor.
This assessment task requires you to:

• determine the most appropriate fault finding method
• document the troubleshooting process
• analyse and identify faults
• obtain suitable tools and equipment
• apply simple checks, tests and fault finding methodologies
• apply the recommended means to rectify each fault and document the results.

Your organisation provides the following basic process flows:

As you work through the faults/issues, complete the following -

Fault/Issue 1
Summarise the category of the fault you are diagnosing:

Describe the planned process to isolate the fault/issue:

List the tools used to determine the fault/issue:

After utilizing the relevant tools, summarise the outcomes:

List the specific symptoms of the problem:

State the identified cause of the problem:

Describe the solution to the problem:

Describe how the solution was implemented:

Outline the tests to ensure the suitability of the solution:

List the relevant ongoing maintenance requirements to prevent the fault/issue from occurring in the
future:

Outline any subsequent faults/issues that may arise from the rectified problem:

Fault/Issue 2
Summarise the category of the fault you are diagnosing:

Describe the planned process to isolate the fault/issue:

List the tools used to determine the fault/issue:

After utilizing the relevant tools, summarise the outcomes:

List the specific symptoms of the problem:

State the identified cause of the problem:

Describe the solution to the problem:

Describe how the solution was implemented:

Outline the tests to ensure the suitability of the solution:

List the relevant ongoing maintenance requirements to prevent the fault/issue from occurring in the
future:

Outline any subsequent faults/issues that may arise from the rectified problem:

Fault/Issue 3
Summarise the category of the fault you are diagnosing:

Describe the planned process to isolate the fault/issue:

List the tools used to determine the fault/issue:

After utilizing the relevant tools, summarise the outcomes:

List the specific symptoms of the problem:

State the identified cause of the problem:

Describe the solution to the problem:

Describe how the solution was implemented:

Outline the tests to ensure the suitability of the solution:

List the relevant ongoing maintenance requirements to prevent the fault/issue from occurring in the
future:

Outline any subsequent faults/issues that may arise from the rectified problem:

ICT40418 CERTIFICATE IV IN INFORMATION TECHNOLOGY
NETWORKING
College Copy

Unit Code and Title: ICTSAS426 Locate and troubleshoot ICT equipment, system
and software faults
Assessment task Due Dates

Assessment 1 Due Date:

Assessment 2 Due Date:

I, ______________________ (Student ID: ______________), acknowledge receiving the
Student Assessment Information Pack, which contains:

o Assessment Due Date Sheet


o Time table / Training Plan
o Lesson Plan
o Student Assessment Information Guide
o Assessment Cover Sheets
o Feedback form
o Student Resource
o Internet Access for online Business Environment Simulation with Login Key or access to college
simulated business documents on internal intranet.

Student Signature:

Date :

ICT40418 CERTIFICATE IV IN INFORMATION TECHNOLOGY
NETWORKING
Student Copy

Unit Code and Title: ICTSAS426 Locate and troubleshoot ICT equipment, system
and software faults
Assessment task Due Dates

Assessment 1 Due Date:

Assessment 2 Due Date:

I, ______________________ (Student ID: ______________), acknowledge receiving the
Student Assessment Information Pack, which contains:

o Assessment Due Date Sheet


o Time table / Training Plan
o Lesson Plan
o Student Assessment Information Guide
o Assessment Cover Sheets
o Feedback form
o Student Resource
o Internet Access for online Business Environment Simulation with Login Key or access to college
simulated business documents on internal intranet.

Student Signature:

Date :

ASSESSMENT SUMMARY / COVER SHEET
This form is to be completed by the assessor and used as a final record of student competency.
All student submissions, including any associated checklists (outlined below), are to be attached to
this cover sheet before it is placed on the student’s file. Student results are not to be entered onto the
Student Database unless all relevant paperwork is completed and attached to this form.
Student Name:

Student ID No:

Final Completion Date:

Unit Code: ICTSAS426

Unit Title: Locate and troubleshoot ICT equipment, system and software faults

Assessor’s Name:
Unit Outcome: C | NYC
Result: S = Satisfactory, NYS = Not Yet Satisfactory, NA = Not Assessed
• Knowledge Assessment - Questions and Answers: S | NYS | NA
• Task 1: S | NYS | NA

Is the Learner ready for assessment? Yes | No
Has the assessment process been explained? Yes | No
Does the Learner understand which evidence is to be collected and how? Yes | No
Have the Learner’s rights and the appeal system been fully explained? Yes | No
Have you discussed any special needs to be considered during assessment? Yes | No
I agree to undertake assessment in the knowledge that information gathered will only be used for
professional development purposes and can only be accessed by my manager and the RTO:
Learner Signature:
Date:
I have received, discussed and accepted my result as mentioned above for
this unit assessment, and I am aware of my right to appeal.
Assessor Signature:
Date:
I declare that I have conducted a fair, valid, reliable and flexible
assessment with this student, and I have provided appropriate feedback.

ASSESSMENT COVER SHEET

Unit: ICTSAS426 Locate and troubleshoot ICT equipment, system and software faults

Course: ICT40418 CERTIFICATE IV IN INFORMATION TECHNOLOGY NETWORKING

Student Name: Student ID:

Group: Date

Title of Assignment: Knowledge Assessment - Questions and Answers

Assessor Name:

This cover sheet must be attached to your assignment.

Declaration:
1. I am aware that penalties exist for plagiarism and unauthorized collusion with other
students.
2. I am aware of the requirements set by my educator with regard to the presentation
of documents and assignments.
3. I have retained a copy of my assignment.

Student Signature: ___________________________

Date: ________________________________________

QUESTION & ANSWER CHECKLIST

ICTSAS426 - Locate and troubleshoot ICT equipment, system and software faults
S | NYS
Learner’s name:

Assessor’s name:

Question | Correct (✓)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
Feedback to Learner:

Assessor’s Signature:
Date:

ASSESSMENT COVER SHEET

Unit: ICTSAS426 Locate and troubleshoot ICT equipment, system and software faults

Course: ICT40418 CERTIFICATE IV IN INFORMATION TECHNOLOGY NETWORKING

Student Name: Student ID:

Group: Date

Title of Assignment: Task 1

Assessor Name:

This cover sheet must be attached to your assignment.

Declaration:
1. I am aware that penalties exist for plagiarism and unauthorized collusion with other
students.
2. I am aware of the requirements set by my educator about the presentation of
documents and assignments.
3. I have retained a copy of my assignment.

Student Signature: ___________________________

Date: ________________________________________

TASK 1 CHECKLIST

S NYS
Learner’s name:

Assessor’s name:

Observation Criteria | S | NYS
Developed a troubleshooting process to help resolve problems
Analysed and documented the system that requires troubleshooting
Identified available fault finding tools and determined the most appropriate for the identified problem
Obtained the required fault finding tools
Identified legislation, health and safety requirements, codes, regulations and standards related to the problem area
Collected data relevant to the system
Analysed the data to determine if there is a problem and the nature of the problem
Determined specific symptoms of hardware, operating system and printer problems
Formulated a solution and made provision for rollback
Systematically tested variables until the problem was isolated
Rectified the problem
Created a list of probable causes of the problem
Tested the system to ensure the problem had been solved and recorded the results
Identified and implemented common preventative maintenance techniques to support ongoing maintenance strategies
Documented the signs and symptoms of the problem and its solution, and loaded them to a database of problems and solutions for future reference
Feedback to Learner:

Assessor’s Signature:
Date:

Student Feedback Form
Unit ICTSAS426 Locate and troubleshoot ICT equipment, system and
software faults
Student Name: Date
Assessor Name:
Please provide us with some feedback on your assessment process. Information provided on this form
is used to evaluate our assessment systems and processes.
This information is confidential and is not released to any external parties without your written
consent. There is no need to sign your name as your feedback is confidential.
Rating scale: Strongly Disagree | Agree | Strongly Agree
I received information about the assessment requirements prior to undertaking the tasks: 1 2 3 4 5
The assessment instructions were clear and easy to understand: 1 2 3 4 5
I understood the purpose of the assessment: 1 2 3 4 5
The assessment met my expectations: 1 2 3 4 5
My Assessor was organised and well prepared: 1 2 3 4 5
The assessment was Fair, Valid, Flexible and Reliable: 1 2 3 4 5
My Assessor's conduct was professional: 1 2 3 4 5
The assessment was an accurate reflection of the unit requirements: 1 2 3 4 5
I was comfortable with the outcome of the assessment: 1 2 3 4 5
I received feedback about assessments I completed: 1 2 3 4 5
The pace of this unit was: Too Slow | Great Pace | Too Fast
Comments:
