
How To Compare Microsoft Excel Worksheets With Florencesoft DiffEngineX

Overview

DiffEngineX reports the differences between two Microsoft® Excel workbooks. You
can choose whether to compare every worksheet contained in the two books (select
Whole Workbooks) or just selected sheets (select Selected Sheets). If you
compare at the Whole Workbook level, worksheets with the same name are
automatically compared with each other. If you want to compare two sheets with
different names you must select Selected Sheets and make sure only one sheet is
selected in each of the list boxes.

DiffEngineX only reports on the differences between the formulae and constants
found in worksheets. It does not report on cell comment differences, nor does it
compare charts.

Every comparison will generate a new workbook listing each cell difference. This
new workbook will contain one worksheet for every pair of sheets compared. Its
very last worksheet is a summary of the number of different cells found in each
pair.

DiffEngineX does not modify the workbooks you select to be compared in any way.
However, when some of its options are selected, it creates in-memory copies of
your workbooks and modifies these instead. DiffEngineX needs to modify these
copies as part of the work done for Color Differences, Align Rows and Align
Columns.

A step-by-step tutorial on how to compare two lists of data can be found at the
bottom of this help page.

Click here to get the free trial version of DiffEngineX.

Important

If it seems DiffEngineX is failing to spot similarities between worksheets and as a
result is reporting spurious differences, you will need to select Align Columns and/or
Align Rows. Alignment is the insertion of blank rows and/or columns such that the
identical cells in two sheets end up with the same row and column numbers.
Identical cells are only recognised for what they are if they have the same row and
column numbers.

When Align Columns is selected you will be asked to select a row (or the rows)
containing column headings, after the Start Comparison button has been pressed. If
the sheets being compared do not have column headings, select a row present in
both sheets that has the same meaning and same row number.

When Align Rows is selected you will be asked to select a column (or the
columns) that will be looked at during the process of row alignment, after the Start
Comparison button has been pressed. Selection of columns A and/or B will work for
most cases. If this is not appropriate, select the first non-blank column present in
both sheets that has the same meaning and column number.

It is recommended that Color Differences is always selected, as it offers a much
clearer way to see differences than a cell-by-cell listing.

Two important options to find out about are Compact Like Changes When
Contiguous and The Actual Formulae/Their Calculated Values.

International Users

If your current regional settings do not match the language version of Office/Excel
installed, DiffEngineX may not work correctly.

To prevent problems it is recommended you select the option Ensure application
works when Excel language version not equal to Regional Settings.

Difference Report

The difference report is generated in a new workbook. It contains one worksheet for
every pair of worksheets compared. Its last sheet is a summary of the number of
differences found.

If row or column alignment is selected, the first entries show where the new blank
rows and columns have been inserted. They are inserted to maximize the number
of identical cells having matching co-ordinates between each pair of sheets. Cells
with identical content will be flagged as different unless they share the same co-
ordinates. The insertion of blank rows and/or columns by DiffEngineX indicates the
changes between the two worksheets have included the addition or deletion of rows
and columns.

The next entries list the cell differences. They are organized into five columns.

The first column contains the addresses of cells found to differ. (Each entry is the
address of a single cell unless the option Compact Like Changes When Contiguous
has been selected.) If row/column alignment is selected, the addresses refer to the
workbook copies made by DiffEngineX. These addresses will differ from the ones in
your original workbooks if blank alignment rows/columns have been inserted.

The next two columns quote the cell content found to differ.

The last two columns are only relevant if row and/or column alignment is selected.
They contain the cell addresses of the different content in the original workbooks
selected for comparison. These are unaffected by the insertion of blank alignment
rows and columns. Blank rows and columns are only inserted in workbook copies.
If you are referring back to your original workbooks you should use the
values in these two columns rather than the very first one.

Extras

When Color Differences is selected, the cells that differ between two sheets are
highlighted with color. When the comparison has finished, copies of the two
workbooks you selected are generated with the added color, in addition to the
difference report workbook. Once again these are _copies_ of your workbooks and
will not be saved to your hard drive unless you ask Microsoft® Excel to.

The Extras dialog allows you to specify what colors are used. It is invoked by
pressing the Extras button.

In the Extras dialog, a deleted cell is defined as a cell with content in Workbook #1,
but no content in Workbook #2. An addition is defined as a cell with content in
Workbook #2, but no content in Workbook #1. In this respect Workbook #1 can be
considered the original workbook and Workbook #2 the modified copy.
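
The definitions above can be restated as a small decision rule. The following is a minimal sketch only, not DiffEngineX's own code: the function name classify is hypothetical, and treating None or an empty string as "no content" is an assumption made for the illustration.

```python
from typing import Optional


def classify(cell1: Optional[str], cell2: Optional[str]) -> str:
    """Classify one pair of cells using the Extras dialog definitions.

    cell1 is the content found in Workbook #1, cell2 the content in Workbook #2.
    None or "" stands for "no content" (an assumption of this sketch).
    """
    has1 = bool(cell1)
    has2 = bool(cell2)
    if has1 and not has2:
        return "deleted"   # content in Workbook #1, none in Workbook #2
    if has2 and not has1:
        return "added"     # content in Workbook #2, none in Workbook #1
    if has1 and has2 and cell1 != cell2:
        return "modified"  # content in both workbooks, but it differs
    return "same"


print(classify("=A1*3", None))     # deleted
print(classify(None, "Website"))   # added
print(classify("=A1*3", "=A1*9"))  # modified
```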

Existing Color Removal

Existing workbook color may make inspection of the results difficult. The Extras
dialog offers you the option of removing unconditional color from the workbook
copies before it starts to highlight the differences. Color is not removed from the
original workbooks you select.

Existing Hidden Sheets/Cells

Excel allows spreadsheet authors to hide whole rows and columns. Additionally
whole sheets may be made invisible. Differences may occur in these hidden
regions. Obviously there is little point in coloring a pair of cells if the results cannot
be seen. An option exists to unhide sheets, rows and columns on the workbook
copies. The original workbooks are not modified by selecting this option. Note that
selecting this option will only have an effect if Color Differences is also selected
on the main part of the user interface.

Hide Matching Rows

If large worksheets are compared and the different rows are widely separated,
inspection of all the color highlighted rows may be difficult. An option exists to hide
the matching rows, just leaving the different rows visible. Selecting Yes for this
option will hide all matching rows. Selecting Yes, but show 4 rows on either
side of each differing row as context will leave some matching rows visible.
Note that selecting this option will only have an effect if Color Differences is also
selected on the main part of the user interface.
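
To make the difference between the two Yes settings concrete, the sketch below works out which rows would stay visible. It is an illustration under stated assumptions, not the product's implementation: the function name visible_rows is hypothetical and rows are numbered from 1 as in Excel.

```python
from typing import Iterable, List


def visible_rows(total_rows: int, differing: Iterable[int], context: int = 0) -> List[int]:
    """Return the 1-based row numbers left visible once matching rows are hidden.

    context=0 models the plain "Yes" setting; context=4 models
    "Yes, but show 4 rows on either side of each differing row as context".
    """
    keep = set()
    for row in differing:
        low = max(1, row - context)              # do not run off the top of the sheet
        high = min(total_rows, row + context)    # or off the bottom
        keep.update(range(low, high + 1))
    return sorted(keep)


print(visible_rows(20, {10}))             # only the differing row
print(visible_rows(20, {10}, context=4))  # rows 6 to 14
```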

If you require the individual characters that differ between cells precisely
highlighted on the difference report, see the option Color in Red Precisely The Parts
of Formulae and Text Constants That Differ.

Align Columns

If you are comparing an original worksheet to its modified copy and part of the
modifications included the insertion or deletion of columns you will need to select
Align Columns.

Without Align Columns similar regions in the sheets will be incorrectly reported as
different just because the same content is shifted to the left or right.

When the Align Columns checkbox is selected you will be asked to specify what
rows are used to help with column alignment after pressing the Start Comparison
button. A good choice is to select the row containing column headings, if one exists.
At the very least select the first row that contains content.

If the modifications that have taken place do not include the insertion or deletion of
columns this option should not be selected.

Align Columns works by inserting blank columns into the workbook copies being
compared. These blank columns are color highlighted. The color used can be
specified with the Extras dialog.

Align Columns Example

Consider the workbooks original.xls and modified.xls shown in Figure 1. You can see
the new Website column has been inserted into modified.xls.

original1.xls and modified1.xls are the result of a comparison without column
alignment. You can see the Personal and Work Emails have been incorrectly flagged
as different.

original2.xls and modified2.xls are the result of a comparison with column
alignment. Row 1 was added to the Selected Rows list using the Align
Columns dialog box. Row 1 contains the column headings of First Name, Last
Name, (Website), Personal Email and Work Email. The new Website column cells are
correctly flagged as new content by use of the color green.

Align Columns also works when columns have been deleted.

Figure 1.

Align Rows

If you are comparing an original worksheet to its modified copy and part of the
modifications included the insertion or deletion of rows you will need to select Align
Rows.

Without Align Rows, similar regions will be incorrectly reported as different just
because the same content is shifted up or down.

When the Align Rows checkbox is selected you will be asked to specify what
columns are used to help with row alignment after pressing the Start Comparison
button. A reasonable choice is to select the first column with content.

If the modifications that have taken place do not include the insertion or deletion of
rows this option should not be selected.

Align Rows works by inserting blank rows into the workbook copies being
compared. These blank rows are color highlighted. The color used can be specified
with the Extras dialog.

Align Rows Example

Consider the workbooks original.xls and modified.xls shown in Figure 2. You can see
a new row with the content { 4400, Sports Car, 999 } has been added.

original1.xls and modified1.xls are the result of a comparison without row
alignment. You can see three rows have been incorrectly flagged as different, when
only one new row was added.

original2.xls and modified2.xls are the result of a comparison with row alignment.
Column A was added to the Selected Columns list using the Align Rows
dialog box. Column A contains order numbers that uniquely describe the contents
of each row. The new row is correctly flagged as new content by use of the color
green.

Align Rows also works when rows have been deleted.
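
The effect of row alignment can be imitated in a few lines. The sketch below is not DiffEngineX's algorithm: it swaps in Python's difflib.SequenceMatcher to decide where blank rows go, and every row other than { 4400, Sports Car, 999 } is made-up sample data. Column A (index 0) plays the role of the order number column.

```python
from difflib import SequenceMatcher
from itertools import zip_longest


def count_different_rows(sheet1, sheet2):
    """Count row positions whose content differs (rows compared by position only)."""
    return sum(1 for a, b in zip_longest(sheet1, sheet2) if a != b)


def align_rows(sheet1, sheet2, key_col=0):
    """Pad both sheets with blank rows (None) so rows with equal keys line up."""
    keys1 = [row[key_col] for row in sheet1]
    keys2 = [row[key_col] for row in sheet2]
    out1, out2 = [], []
    matcher = SequenceMatcher(a=keys1, b=keys2, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left, right = sheet1[i1:i2], sheet2[j1:j2]
        if tag == "equal":
            out1.extend(left)
            out2.extend(right)
        else:
            # Rows present on one side only get a blank partner on the other side.
            out1.extend(left)
            out2.extend([None] * len(left))
            out1.extend([None] * len(right))
            out2.extend(right)
    return out1, out2


# Hypothetical sample data, apart from the new row quoted in the example above.
original = [(4398, "Saloon", 500), (4399, "Estate", 650),
            (4401, "Van", 800), (4402, "Truck", 950)]
modified = original[:2] + [(4400, "Sports Car", 999)] + original[2:]

aligned1, aligned2 = align_rows(original, modified)
print(count_different_rows(original, modified))  # without alignment
print(count_different_rows(aligned1, aligned2))  # with alignment
```

With this sample data the unaligned comparison flags three row positions, while the aligned one flags only the position holding the new row.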

Figure 2.

Options

Compact Like Changes When Contiguous

Selecting this option can potentially reduce the verbosity of DiffEngineX reports.

For example, if three adjacent cells contain equivalent content and they are all
changed to the same formula or constant, the change is reported on one line
instead of three.

For example

E2:G2    =A1*3    =A1*9

will be listed instead of

E2    =A1*3    =A1*9
F2    =B1*3    =B1*9
G2    =C1*3    =C1*9

For multi-cell ranges of equivalent formulae, the single A1 style formula shown is
relative to the first cell of the range.
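
Why =A1*3 in E2 and =B1*3 in F2 count as equivalent becomes clearer once each reference is rewritten as an offset from its own cell. The sketch below is an illustration, not DiffEngineX's implementation. It assumes simple relative references in upper case (no $ signs, sheet names or function names that look like references) and only merges cells that sit next to each other on one row. All function names are hypothetical.

```python
import re

_REF = re.compile(r"\b([A-Z]{1,3})(\d+)\b")


def col_number(letters):
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def split_address(address):
    """Split an address such as 'E2' into (column number, row number)."""
    m = _REF.fullmatch(address)
    return col_number(m.group(1)), int(m.group(2))


def relative_form(formula, address):
    """Rewrite every A1 reference as a row/column offset from the given cell."""
    col, row = split_address(address)

    def repl(m):
        return f"R[{int(m.group(2)) - row}]C[{col_number(m.group(1)) - col}]"

    return _REF.sub(repl, formula)


def compact(changes):
    """Merge adjacent same-row changes whose old and new formulae are equivalent.

    changes is a list of (address, old, new) tuples ordered left to right.
    """
    groups = []
    for address, old, new in changes:
        col, row = split_address(address)
        signature = (relative_form(old, address), relative_form(new, address))
        last = groups[-1] if groups else None
        if (last and last["row"] == row and last["last_col"] + 1 == col
                and last["signature"] == signature):
            last["last_col"], last["last"] = col, address
        else:
            groups.append({"row": row, "last_col": col, "signature": signature,
                           "first": address, "last": address, "old": old, "new": new})
    # The formulae reported for a range are those of its first cell.
    return [(g["first"] if g["first"] == g["last"] else f'{g["first"]}:{g["last"]}',
             g["old"], g["new"]) for g in groups]


print(compact([("E2", "=A1*3", "=A1*9"),
               ("F2", "=B1*3", "=B1*9"),
               ("G2", "=C1*3", "=C1*9")]))
```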

Color Alternate Rows

Selecting this option makes difference reports easier to read as every other row is
color highlighted.

Color in Red Precisely The Parts of Formulae and Text Constants That Differ

Selecting this option highlights the exact parts of formulae and text constants, with
the color red, that differ between two worksheets. The highlighting is applied to the
cell content quoted on the difference report.

Dates and numeric constants are not covered by this option.

This option can slow down comparisons when the number of cell differences is
large. It is recommended that Compact Like Changes When Contiguous is selected
together with this option, to reduce the amount of work performed on each
difference report.

This option should not be confused with Color Differences which is available on the
main part of the user interface. Color Differences applies background color to whole
cells on copies of the workbooks selected for comparison. The option discussed
here applies foreground color to selected parts of formulae and text constants on
the difference report.

An example of the precise highlighting offered by this option is shown below.

E1 =A1+Costs+4 =A1+NewCosts+6
G2 The quick cat. The slow cat.

A1 or R1C1 Notation

The reports DiffEngineX generates contain cell content where it is found to differ. If
the differing content contains formulae this option allows it to be reported in either
A1 or R1C1 notation. In A1 notation rows are labeled numerically and columns are
labeled alphabetically. In R1C1 notation, both columns and rows are labeled
numerically.
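
As a small worked illustration of the two notations, the sketch below converts an A1 style cell address into its R1C1 equivalent. The function names are hypothetical, and only plain single-cell addresses are handled, not whole formulae.

```python
import re


def column_letters_to_number(letters: str) -> int:
    """Convert alphabetical column labels to numbers: A -> 1, Z -> 26, AA -> 27."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def a1_to_r1c1(address: str) -> str:
    """Convert a cell address such as 'E2' to R1C1 style ('R2C5')."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", address)
    if match is None:
        raise ValueError(f"not a plain A1 style address: {address!r}")
    row = int(match.group(2))
    column = column_letters_to_number(match.group(1))
    return f"R{row}C{column}"


print(a1_to_r1c1("E2"))                # R2C5
print(column_letters_to_number("IV"))  # 256
```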

Case Insensitive Comparisons

Select this option if you want cell content to be compared without regard to its
capitalization. For example when this option is checked the constant "Sales" will be
treated as equivalent to "sales".

The Actual Formulae or Their Calculated Values

If two cells containing formulae are being compared, a choice has to be made
whether to compare the actual formulae themselves or their calculated values.

For example if two cells containing =2*6 and =3*4 are compared with The Actual
Formulae checked they will be reported as different. If Their Calculated Values
is checked they will be reported as identical.
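
The two comparison modes, together with the Case Insensitive Comparisons option described earlier, can be pictured with a toy model of a cell. This is a sketch under assumptions, not DiffEngineX code: the Cell class and cells_match function are hypothetical, and how the real product applies case folding to formulae is not stated in this guide.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    formula: str   # what was typed into the cell, e.g. "=2*6" or "Sales"
    value: object  # what Excel displays after calculation, e.g. 12 or "Sales"


def cells_match(a: Cell, b: Cell, compare_formulae: bool = True,
                case_insensitive: bool = False) -> bool:
    """Compare two cells by formula text or by calculated value."""
    if compare_formulae:
        left, right = a.formula, b.formula
    else:
        left, right = a.value, b.value
    if case_insensitive and isinstance(left, str) and isinstance(right, str):
        left, right = left.casefold(), right.casefold()
    return left == right


first, second = Cell("=2*6", 12), Cell("=3*4", 12)
print(cells_match(first, second, compare_formulae=True))   # False: formulae differ
print(cells_match(first, second, compare_formulae=False))  # True: both calculate to 12
```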

Ensure application works when Excel language version not equal to Regional Settings

If your Control Panel Regional Options (such as French (Canada), Italian (Italy)
etc.) do not match the localized language version of Excel you have installed,
DiffEngineX will generate error messages each time it is run.

To prevent problems you may wish to consider one of the following.

• Purchase a localized language version of Excel that matches your Regional
Options.
• Change your Control Panel Regional Options to match the language version of
Excel.
• Check the DiffEngineX option Ensure application works when Excel
language version not equal to Regional Settings.

Changing your Control Panel Regional Options is not recommended as it has wide
ranging effects.

Only check the DiffEngineX provided option if you encounter problems.

Figure 3.

Command Line Arguments

If command line arguments are supplied to DiffEngineX, it will compare workbooks
without its user interface being displayed. This can be useful if you wish to compare
multiple workbooks one after another using a series of commands stored in a *.bat
file.

You must first locate the DiffEngineX.exe file. Typically it will have the location
specified below.

C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe

Values are passed to DiffEngineX by means of switches. Each switch is prefixed with
a forward slash / and is separated from its associated value or values by a colon :.

Some switches are associated with a single value. Others are associated with a
comma separated list of values. File names and comma separated lists containing
white space characters must be enclosed with double quotation marks e.g.

The switch /sheets has been used to limit the comparison to the sheets Cash Flow,
Notes and Annual Fin St.
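
As a hedged illustration of the quoting rule, the Python sketch below assembles one such command as a single line of text. It is not the example from the original guide: the workbook file names are hypothetical, while the switches /inbook1, /inbook2, /report and /sheets and the three sheet names are taken from this page. The helper names switch and build_command are also made up for the sketch.

```python
EXE = r"C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe"


def switch(name, value=None):
    """Format one switch as /name or /name:"value".

    Lists are joined with commas. Values are always wrapped in double
    quotation marks, which also covers values containing white space.
    """
    if value is None:
        return f"/{name}"
    if isinstance(value, (list, tuple)):
        value = ",".join(value)
    return f'/{name}:"{value}"'


def build_command(*switches):
    """Join the quoted program path and the switches into a single line."""
    return " ".join([f'"{EXE}"', *switches])


command = build_command(
    switch("inbook1", "accounts 2007.xls"),     # hypothetical file name
    switch("inbook2", "accounts 2008.xls"),     # hypothetical file name
    switch("report", "accounts changes.xls"),   # hypothetical file name
    switch("sheets", ["Cash Flow", "Notes", "Annual Fin St"]),
)
print(command)
```

The printed text is one line, so it can be pasted into a command prompt or a *.bat file as it stands.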

Although the examples shown here are split across several lines, ensure that each
individual command does not contain newlines or carriage returns.

To display a list of all supported switches enter the following from the command
prompt.

"C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe" /help

The only mandatory switches are /inbook1, /inbook2 and /report. Typically you will
want different cells to be color highlighted, identical changes grouped together
(when adjacent) and the results saved to disk. Your original input workbooks are
not modified by the color highlighting. The changes are made to copies. DiffEngineX
will never overwrite existing files. To ensure your commands are not interrupted
you should explicitly delete old reports beforehand. The below example achieves
this.
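
The same idea can be sketched in Python for readers who script their comparisons that way. This is not the batch example from the original guide. The helper names and all file names are hypothetical, and the command is only built as text here, not run. Only switches documented on this page are used.

```python
import os


def delete_if_present(paths):
    """Remove old output files so a new run is not blocked by them."""
    removed = []
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed


def typical_command(exe, book1, book2, report, out1, out2):
    """Build a one-line command that colors differences, groups like changes
    and saves the report and both colored copies to disk."""
    parts = [
        f'"{exe}"',
        f'/inbook1:"{book1}"',
        f'/inbook2:"{book2}"',
        f'/report:"{report}"',
        f'/outbook1:"{out1}"',
        f'/outbook2:"{out2}"',
        "/colordifferences",
        "/compactchanges",
    ]
    return " ".join(parts)


# Hypothetical file names, for illustration only.
outputs = ["diffreport.xls", "coloredcopy1.xls", "coloredcopy2.xls"]
delete_if_present(outputs)
print(typical_command(r"C:\Program Files\Florencesoft\DiffEngineX\DiffEngineX.exe",
                      "budget old.xls", "budget new.xls", *outputs))
```

Deleting the three output files first means the command is not stopped by files left over from an earlier run.
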

The colors used to indicate modified, deleted and added cell content can be
individually specified. If a color is not specified on the command line, the one used
by the user interface is taken. The example command below additionally specifies
that existing workbook color be removed using the switch /removeexistingcolor.

The available colors (1 - 56) are shown in the palette below.

Figure 4.

Switches

If a switch accepts a Boolean true or false value, then omitting the value is
equivalent to specifying true, i.e. /colordifferences:true is the same as
/colordifferences, but not the same as /colordifferences:false.

| Example Switch and Value | Description | Action when Switch Omitted |
| --- | --- | --- |
| /inbook1:"myworkbook1.xls" | Path and file name of 1st book to compare. | Mandatory |
| /inbook2:"myworkbook2.xls" | Path and file name of 2nd book to compare. | Mandatory |
| /report:"mydiffreport.xls" | Path and file name of output difference report. | Mandatory |
| /outbook1:"coloredcopy1.xls" | Path and file name of altered copy of 1st book to output. Different cells will be colored if /colordifferences specified as well. | No copy saved. |
| /outbook2:"coloredcopy2.xls" | Path and file name of altered copy of 2nd book to output. Different cells will be colored if /colordifferences specified as well. | No copy saved. |
| /alignrows:"A,B" | Comma separated list of alphabetical columns to examine when aligning rows (A - IV). A maximum of 5 columns can be specified. Typically either "A" or "A,B" will be specified. | Similar rows will not be aligned. |
| /aligncolumns:"1" | Comma separated list of numerical rows to examine when aligning columns. A maximum of 5 rows can be specified. Only use this if all your sheets have a distinct row containing column headings. | Similar columns will not be aligned. |
| /sheets:"Sheet1,Summary,Inputs" | Specify this to limit comparisons to specific sheets. | All the matching sheets in the 2 workbooks will be compared. |
| /colordifferences:true or false | If true, different cells will be color highlighted. The /outbook1 and 2 switches must also be specified. | No action. |
| /removeexistingcolor:true or false | Remove unconditional fill color from cells in workbook copies to make color highlighting clearer. Note /colordifferences must also be specified. | No action. |
| /unhidesheetsrowscols:true or false | Hidden sheets, rows and columns are made visible in workbook copies so differences cannot be obscured. Note /colordifferences must also be specified. | No action. |
| /hidematchingrows:n | n is integer (1 - 3). 1 hides matching rows. 2 hides matching rows except those near differing rows. 3 hides no rows. Note /colordifferences must also be specified. Takes precedence over /unhidesheetsrowscols. | Matching rows will not be hidden. |
| /compactchanges:true or false | Equivalent changes to adjacent cells are grouped together in difference report. | No action. |
| /coloralternaterows:true or false | Alternate lines in difference report are colored. | No action. |
| /colorprecise:true or false | Text and formulae differences are highlighted at the character level with color red in difference report. Time consuming option. | No action. |
| /stylea1:true or false | If true, formulae are listed using the A1 style in the difference report. If false, R1C1 is used. | A1 style is used when switch omitted. |
| /caseinsensitive:true or false | If true, strings are compared without regard to case. | If omitted, case sensitive comparisons are used. |
| /compareformulae:true or false | If true, formulae are directly compared, rather than their calculated end results. | If omitted, formulae are compared. |
| /ensureworksinternationally:true or false | If true, DiffEngineX will work despite Control Panel Regional Options differing from the language version of Excel. | DiffEngineX will fail when Regional Options do not equal Excel language. |
| /modifiedcolor:n | n is integer (1 - 56). /colordifferences must be specified as well. | User interface color used. |
| /deletedcolor:n | n is integer (1 - 56). /colordifferences must be specified as well. | User interface color used. |
| /addedcolor:n | n is integer (1 - 56). /colordifferences must be specified as well. | User interface color used. |
| /alignrowcolor:n | n is integer (1 - 56). /colordifferences must be specified as well. | User interface color used. |
| /aligncolcolor:n | n is integer (1 - 56). /colordifferences must be specified as well. | User interface color used. |
| /help or /h or /? | Displays the available switches. Any other switch, if specified as well, is ignored. | No action. |
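
The rule that a Boolean switch with no value means true can be written down in a few lines. The sketch below only illustrates that rule. The function name parse_boolean_switch is hypothetical, and it accepts only the exact lower case words true and false, which is an assumption rather than documented DiffEngineX behaviour.

```python
def parse_boolean_switch(argument: str):
    """Split '/name', '/name:true' or '/name:false' into (name, bool)."""
    if not argument.startswith("/"):
        raise ValueError(f"switches start with a forward slash: {argument!r}")
    # Everything before the first colon is the switch name, the rest its value.
    name, _, raw = argument[1:].partition(":")
    if raw in ("", "true"):
        return name, True   # an omitted value is equivalent to true
    if raw == "false":
        return name, False
    raise ValueError(f"expected true or false, got {raw!r}")


print(parse_boolean_switch("/colordifferences"))        # ('colordifferences', True)
print(parse_boolean_switch("/colordifferences:true"))   # ('colordifferences', True)
print(parse_boolean_switch("/colordifferences:false"))  # ('colordifferences', False)
```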

Tutorial: How to Compare Two Excel Lists

A common business problem often concerns finding out what names and addresses
appear in one list but not another. After the new data has been identified it is useful
to be able to extract it into a new Excel workbook.

DiffEngineX can do the bulk of this type of work. Knowing a few Excel tricks and
what options to select in DiffEngineX can greatly improve the end results.

Consider the two lists shown below. Even though DiffEngineX has the capability to
align similar rows it needs some help from you first. This is because some of the
changes involve not just the vertical displacement of rows, but a reordering. In the
first list the "Dobbs, Bob" row is before the "Rivers, Doreen" row. In the second list
the order has been reversed.

Figure 5 - Two Lists

DiffEngineX will insert blank rows to get existing rows to match up, but it will not
reorder them.

To get around this problem you should ask Excel to sort your lists before using
DiffEngineX to compare them. (Sorting is an optional step. Your data may not
require it.)

Below we see our two original lists after Excel has sorted them on last and first
name. Alternatively we could have sorted them by their ID column.
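
For readers who prepare their lists with a script rather than Excel's Sort dialog, the sorting step can be sketched as follows. This is an optional illustration, not part of DiffEngineX. The rows are held as Python dictionaries, the ID values are made up, and comparing names without regard to case is an assumption of the sketch.

```python
def sort_list(rows, key_columns=("Last Name", "First Name")):
    """Return the rows ordered by the given columns, ignoring case."""
    return sorted(rows, key=lambda row: tuple(str(row[c]).casefold() for c in key_columns))


# The two names come from the example above. The ID values are hypothetical.
second_list = [
    {"ID": 2, "Last Name": "Rivers", "First Name": "Doreen"},
    {"ID": 1, "Last Name": "Dobbs", "First Name": "Bob"},
]

for row in sort_list(second_list):
    print(row["Last Name"], row["First Name"])

# Sorting on the ID column instead, as mentioned above:
by_id = sort_list(second_list, key_columns=("ID",))
```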

Figure 6 - Two Lists Sorted by Excel

Step-by-Step Instructions

1. First Sort using Excel & Save:
Use Excel to open the two workbooks you want to compare. Click on any cell
in the first list. Now click Excel's Data menu (or tab in Excel 2007) and select
the Sort item. The Sort dialog will now appear. Sort by Last Name and then
by First Name. Hit OK. Now do the same for the second list. In your lists you
can sort on any combination of columns that uniquely identifies each row.
(Sorting is not always necessary.)

2. Save both your sorted workbooks (under different filenames if you prefer)
before closing them.

3. Start up DiffEngineX - Use Options, Extras & Align Rows:
Invoke DiffEngineX and click the Options button. In our example we can see
that some street addresses are in upper case and others in lower case. Here
we don't want such a trivial change to be counted as a modification and so
we select the Case Insensitive Comparisons checkbox. Click OK to dismiss
the dialog box.

4. Click the Extras button. In our example both our lists are small, but in real
life some lists may contain tens of thousands of rows and have hundreds of
differences between them. DiffEngineX uses color to highlight differences in
automatically made copies of the workbooks it compares. We don't want to
have to fish through thousands of rows just to see a few differences. Make
sure the Yes option is selected for Hide Matching Rows. Click OK to dismiss
the dialog box.

5. Select Align Rows on the main part of DiffEngineX's user interface. Ensure
the Color Differences box is checked. Use the Browse buttons to point to your
sorted Excel workbooks. Click the Start Comparison button.

6. We now have to tell DiffEngineX what columns uniquely identify each row. As
we previously sorted on Last Name & First Name we select columns B and C
before clicking the Add button. Hit OK to dismiss the dialog and start the
comparison.

7. The results are shown below in figure 7. We can see DiffEngineX has
correctly spotted the three new rows. However the matching rows are still in
this workbook. They are only hidden. (You can see that Excel is not showing
rows 1, 2, 4, 5 and 7.)

8. The Final Step: Separate the Wheat from the Chaff:
Select the Excel worksheet containing the color highlighted new rows. Click
Excel's Edit menu and select Go To. Click the Special... button. Click Visible
cells only. (If you are using Excel 2007, select the Home tab. Then select Go
To Special... from the Find & Select drop-down menu. Select Visible cells
only.) Hit OK. Select Edit--->Copy. You have now selected and copied just the
visible, new rows.

9. Create a new Excel workbook and use Edit--->Paste to copy across just the
new rows. You now have separated the new rows from the hidden matching
rows.
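
As a rough cross-check of the result, the rows that appear in the second list but not the first can also be found with a short script. This sketch is independent of DiffEngineX and only compares the key columns. The sample names other than Dobbs and Rivers are hypothetical, and matching names without regard to case mirrors the Case Insensitive Comparisons choice made in step 3.

```python
def new_rows(list1, list2, key_columns=("Last Name", "First Name")):
    """Return the rows of list2 whose key columns do not occur in list1."""

    def key(row):
        return tuple(str(row[c]).casefold() for c in key_columns)

    seen = {key(row) for row in list1}
    return [row for row in list2 if key(row) not in seen]


# Dobbs and Rivers come from the tutorial. The other name is made up.
first_list = [
    {"Last Name": "Dobbs", "First Name": "Bob"},
    {"Last Name": "Rivers", "First Name": "Doreen"},
]
second_list = [
    {"Last Name": "DOBBS", "First Name": "Bob"},
    {"Last Name": "Example", "First Name": "Erin"},
    {"Last Name": "Rivers", "First Name": "Doreen"},
]

for row in new_rows(first_list, second_list):
    print(row["Last Name"], row["First Name"])
```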

Note: For more complicated examples than the one shown here, rows may end up being
colored red, green or purple by default to indicate differences. Red and green
indicate that, after row alignment, one of two corresponding cells is blank.
Purple means a cell has content in both sheets, but the content differs. You will
have to inspect both the color highlighted sheets to find out all the differences.

Figure 7 - DiffEngineX Hides Matching Rows

Figure 8 - Use Excel's Edit--->Go To--->Special--->Visible cells only before Copy &
Paste
