A ClientDataSet in Every Database Application http://dn.codegear.com/print/28876

A ClientDataSet in Every Database Application


By: Cary Jensen

Abstract: This article is the first in an extended series designed to explore the ClientDataSet. The
basic behavior of the ClientDataSet is described, and an argument is made for the extensive use of
ClientDataSets in almost all database applications.
The ClientDataSet is a component that holds data in an in-memory table. Until recently, it was only
available in the Enterprise editions of Delphi and C++ Builder. Now, however, it is available in the
professional editions of these products, as well as Kylix. This article is the first in an extended series
designed to explore the capabilities and features of the ClientDataSet.
I have been playing with an idea for a while, and I wanted the title of this article to reflect it (with
my apologies to Herbert Hoover for this twist on his campaign promise of "a chicken in every
pot and a car in every garage"). In short, I believe that a very strong argument can be made for
including one ClientDataSet and a corresponding DataSetProvider for each TDataSet used in an
application. Doing so provides your user interface and runtime code with a consistent set of features
(filters, ranges, searches, and so forth) regardless of the data access technology being employed.
Actually, I have two goals in this first of many articles detailing the ClientDataSet. The first is to set
forth the reasons why I believe that ClientDataSets should play a primary role in most database
applications. The second goal, and the one that I hope you find useful whether or not you accept my
arguments, is to provide a general introduction to the nature and features of the ClientDataSet.
It's this second goal that I will address first. Specifically, in order for my arguments to make sense, it
is essential to first provide an overview of the ClientDataSet, and how it interacts with a
DataSetProvider. This discussion will also serve as a primer for many of the technique-specific articles
that will follow in this series. After this introduction I will return to my first premise, explaining in
detail how you can improve your applications through the thoughtful use of ClientDataSets.

Introduction to the ClientDataSet


The ClientDataSet has been around for a while: since Delphi 3, to be precise. But until recently it
has been available only in the Client/Server or Enterprise editions of Delphi and C++ Builder. In these
editions the ClientDataSet was intended to hold data in a DataSnap (formerly called MIDAS) client
application. While many Enterprise edition developers did make extensive use of the ClientDataSet's
features in non-DataSnap applications, the fact that this component did not exist in the Professional
edition products made recommending its widespread use unrealistic.
With Borland's introduction of dbExpress, which first appeared in Kylix 1.0, the ClientDataSet and its
companion, the DataSetProvider, are now part of Borland's Professional Edition RAD (rapid
application development) products, including Delphi 6, Kylix 2, and C++ Builder 6. Now all Borland
RAD developers have access to this powerful and flexible component (I'm not counting the Personal or
Open edition developers in this group, since those versions do not have the database-related
components in the first place).
With this in mind, let's now take a closer look at how the ClientDataSet works.
The ClientDataSet is a TDataSet descendant that holds data in memory in a table-like structure
consisting of rows (records) and columns (fields). Using the methods of the TDataSet class, a
developer can navigate, sort, search, filter, and edit the data held in memory. Because these
operations are performed on data stored in memory, they are very fast. For example, on a test
machine with 512 MB of RAM running an 850 MHz Pentium 3, an index was built on an integer field
containing random numbers in a 100,000-record table in just under one-half second. Once built, this
index can be used to perform near-instantaneous searches and to set ranges on the indexed field.
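As a rough illustration of these in-memory operations, the following console sketch builds a ClientDataSet entirely in code, indexes an integer field, and then searches and sets a range on it. The field names, the index name, and the row count are assumptions chosen for the example. The unit names are those used by Delphi 6 and 7 (later versions use Data.DB and Datasnap.DBClient), and the sketch assumes your installation includes the MidasLib unit, which links the MIDAS code into the executable so that midas.dll is not needed at run time.

```delphi
program IndexSketch;

{$APPTYPE CONSOLE}

uses
  DB, DBClient, MidasLib;

var
  CDS: TClientDataSet;
  I: Integer;
begin
  CDS := TClientDataSet.Create(nil);
  try
    // Define the structure and create the empty in-memory table.
    CDS.FieldDefs.Add('ID', ftInteger);
    CDS.FieldDefs.Add('Value', ftInteger);
    CDS.CreateDataSet;

    // Bulk load: there is no need to log 100,000 inserts in the change log.
    CDS.LogChanges := False;
    for I := 1 to 100000 do
      CDS.AppendRecord([I, Random(1000000)]);

    // Build an index on the integer field and make it the current index.
    CDS.AddIndex('ByValue', 'Value', []);
    CDS.IndexName := 'ByValue';

    // Search on the indexed field.
    if CDS.FindKey([500000]) then
      Writeln('Found a match, ID = ', CDS.FieldByName('ID').AsInteger)
    else
      Writeln('No record has Value = 500000');

    // Restrict the view to a range of values on the same index.
    CDS.SetRange([1000], [2000]);
    Writeln('Records in range: ', CDS.RecordCount);
    CDS.CancelRange;
  finally
    CDS.Free;
  end;
end.
```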
The ClientDataSet actually contains two data stores. The first, named Data, contains the current view
of the data in memory, including all changes to that data since it was loaded. For example, if a record
was deleted from the dataset, that record is absent from Data. Likewise, records added to the
ClientDataSet are visible in Data.
The second store, named Delta, represents the change log, and contains a record of those changes
that have been made to Data. Specifically, for each record that was inserted or deleted from Data,
there resides a corresponding record in Delta. For modified records it is slightly different. The change
log contains two records for each record modified in Data. One of these is a duplicate of the record
that was originally modified. The second contains the field-by-field changes made to the original
record.
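One way to look at the two stores is to count what each contains after a few edits. The sketch below is a stand-alone console program under the same assumptions as any code-only ClientDataSet example (Delphi 6/7 unit names, MidasLib available); the field names and sample rows are invented for illustration. It relies on the fact that Delta is itself a data packet, so a second ClientDataSet can display it.

```delphi
program DeltaSketch;

{$APPTYPE CONSOLE}

uses
  DB, DBClient, MidasLib;

var
  CDS, DeltaView: TClientDataSet;
begin
  CDS := TClientDataSet.Create(nil);
  DeltaView := TClientDataSet.Create(nil);
  try
    CDS.FieldDefs.Add('ID', ftInteger);
    CDS.FieldDefs.Add('Name', ftString, 20);
    CDS.CreateDataSet;
    CDS.AppendRecord([1, 'Alpha']);
    CDS.AppendRecord([2, 'Beta']);
    CDS.AppendRecord([3, 'Gamma']);
    // Treat these rows as the starting data: fold them into Data
    // and empty the change log.
    CDS.MergeChangeLog;

    // One modification, one deletion, one insertion.
    CDS.First;
    CDS.Edit;
    CDS.FieldByName('Name').AsString := 'Alpha2';
    CDS.Post;
    CDS.Last;
    CDS.Delete;
    CDS.AppendRecord([4, 'Delta']);

    Writeln('Rows visible in Data: ', CDS.RecordCount);
    Writeln('ChangeCount: ', CDS.ChangeCount);

    // Load the change log into a second ClientDataSet to inspect it.
    DeltaView.Data := CDS.Delta;
    Writeln('Rows in Delta: ', DeltaView.RecordCount);
  finally
    DeltaView.Free;
    CDS.Free;
  end;
end.
```

Going by the description above, the modified record should account for two of the rows reported for Delta, while the deleted and inserted records account for one each.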
The change log serves two purposes. First, the information in the change log can be used to undo
edits made to Data, so long as those changes have not yet been resolved to the underlying data
source. By default, this change log is always maintained, meaning that in most applications the
ClientDataSet is always caching updates.
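A minimal sketch of this first role follows, again as a stand-alone console program (Delphi 6/7 unit names and the MidasLib unit are assumed, and the sample data is invented). It uses UndoLastChange to reverse only the most recent edit and CancelUpdates to discard everything that remains in the change log.

```delphi
program UndoSketch;

{$APPTYPE CONSOLE}

uses
  DB, DBClient, MidasLib;

var
  CDS: TClientDataSet;
begin
  CDS := TClientDataSet.Create(nil);
  try
    CDS.FieldDefs.Add('ID', ftInteger);
    CDS.FieldDefs.Add('Name', ftString, 20);
    CDS.CreateDataSet;
    CDS.AppendRecord([1, 'Alpha']);
    CDS.AppendRecord([2, 'Beta']);
    CDS.MergeChangeLog;               // start with an empty change log

    CDS.First;
    CDS.Delete;                       // first logged change
    CDS.AppendRecord([3, 'Gamma']);   // second logged change
    Writeln('Pending changes: ', CDS.ChangeCount);

    // Reverse only the most recent change (the insert).
    CDS.UndoLastChange(True);
    Writeln('After UndoLastChange: ', CDS.ChangeCount);

    // Discard whatever is still in the change log (the delete).
    CDS.CancelUpdates;
    Writeln('After CancelUpdates: ', CDS.ChangeCount,
      ' pending, ', CDS.RecordCount, ' rows');
  finally
    CDS.Free;
  end;
end.
```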
The second role that the change log plays only applies to a ClientDataSet that is used in conjunction
with a DataSetProvider. In this role, the change log provides sufficient detail to permit the
mechanisms supported by the DataSetProvider to apply the logged changes to the dataset from which
the data was loaded. This process begins when you explicitly call the ClientDataSet's ApplyUpdates
method.
When a ClientDataSet is used to read and write data directly from a file, a DataSetProvider is not
used. In those cases, the change log is stored in this file each time you invoke the ClientDataSet's
SaveToFile method, and restored each time you call LoadFromFile (or when you open and close the
ClientDataSet while its FileName property contains the name of the file). The change log is only cleared
in this scenario when you invoke MergeChangeLog or CancelUpdates (this second method causes the
changes to be lost).
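The file-based round trip can be sketched as follows. This is an illustrative console program only: the file name customers.cds is hypothetical, the program writes it to the current directory, and the usual assumptions apply (Delphi 6/7 unit names, MidasLib available).

```delphi
program FileSketch;

{$APPTYPE CONSOLE}

uses
  DB, DBClient, MidasLib;

const
  DataFile = 'customers.cds';   // hypothetical file name

var
  CDS: TClientDataSet;
begin
  CDS := TClientDataSet.Create(nil);
  try
    CDS.FieldDefs.Add('ID', ftInteger);
    CDS.FieldDefs.Add('Name', ftString, 20);
    CDS.CreateDataSet;
    CDS.AppendRecord([1, 'Alpha']);   // logged as one insert

    // The data and the change log are written together.
    CDS.SaveToFile(DataFile);
    CDS.Close;

    // Loading restores both the data and the pending change.
    CDS.LoadFromFile(DataFile);
    Writeln('ChangeCount after reload: ', CDS.ChangeCount);

    // Make the change a permanent part of Data and clear the log.
    CDS.MergeChangeLog;
    CDS.SaveToFile(DataFile);
    Writeln('ChangeCount after merge: ', CDS.ChangeCount);
  finally
    CDS.Free;
  end;
end.
```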
There are quite a few differences between how you use a ClientDataSet depending on whether or not
a DataSetProvider is employed. The following discussion focuses exclusively on the situation where a
ClientDataSet points to a DataSetProvider with its ProviderName property. Using a ClientDataSet
directly with files will be discussed in detail in a future article.

How a ClientDataSet and a DataSetProvider Interact


In order to use a ClientDataSet effectively you must understand how a ClientDataSet interacts with a
DataSetProvider. To illustrate this interaction I have created a Delphi project named
CDSLoadBehaviorDemo. The main form for this project is shown in the following figure. While I will
describe what this project does, it is best if you download this project from Code Central and run it.
That way you can observe first-hand the interaction.

Here is the basic setup. The ClientDataSet points to a DataSetProvider through its ProviderName
property, and the DataSetProvider refers to a TDataSet descendant through its DataSet property.
When you set the ClientDataSet's Active property to True or invoke its Open method, the
ClientDataSet makes a data packet request from the DataSetProvider. This provider then opens the
dataset to which it points, goes to the first record, and then scans through the records until it reaches
the end of the file. With each record it encounters, the DataSetProvider encodes the data into a variant
array. This variant array is sometimes referred to as the data packet. When the DataSetProvider is
done scanning the records, it closes the dataset to which it points, and then passes the data packet to
the ClientDataSet.
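The same wiring can be done in code. The unit below is a small sketch rather than part of the demo project: LoadThroughProvider is a hypothetical helper name, and the unit names are those of Delphi 6 and 7 (later versions use Data.DB, Datasnap.DBClient, and Datasnap.Provider). It calls SetProvider to connect the ClientDataSet to a provider instance directly, instead of having the provider looked up by name through ProviderName.

```delphi
unit LoadSketch;

interface

uses
  DB, DBClient, Provider;

// Hypothetical helper: loads Client with the rows of Source
// by way of Provider.
procedure LoadThroughProvider(Client: TClientDataSet;
  Provider: TDataSetProvider; Source: TDataSet);

implementation

procedure LoadThroughProvider(Client: TClientDataSet;
  Provider: TDataSetProvider; Source: TDataSet);
begin
  // Leave activation of the source dataset entirely to the provider.
  Source.Close;
  Provider.DataSet := Source;

  // Connect the ClientDataSet to this provider instance.
  Client.SetProvider(Provider);

  // Opening the ClientDataSet requests the data packet. As described
  // above, the provider opens Source, scans it, and closes it again.
  Client.Open;
end;

end.
```

With components named as they would be by default on a form, a call might look like LoadThroughProvider(ClientDataSet1, DataSetProvider1, Table1).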
You can see this behavior in the CDSLoadBehaviorDemo project. The DBGrid on the right-hand side of
the main form is connected to a data source that points to a TTable from which the DataSetProvider
gets its data. When you select ClientDataSet | Load from this project's main menu, you will literally
see the TTable's data being scanned in this DBGrid. Once the DataSetProvider gets to the last record
of the TTable, the TTable is closed and this DBGrid appears empty again, as shown in the following
figure.

Whether or not the scanning of the TTable is visible in the CDSLoadBehaviorDemo project is
configurable. Visible scanning is the default in this project, but because this visible scanning requires
so many screen repaints, the ClientDataSet takes quite a bit of time to load the not quite 1000 records
of the Items.db table (the table pointed to by the TTable). If you select View | View Table Loading to
uncheck this menu option, and select ClientDataSet | Load (if data is already loaded, you must first
select ClientDataSet | Unload), you will notice that these records load almost instantly. The actual load
time of a ClientDataSet depends on how much data is loaded.
Returning to a description of the ClientDataSet/DataSetProvider interaction, upon receiving the
variant array, the ClientDataSet unpacks this data into memory. The structure of this dataset is based
on metadata that the DataSetProvider encodes in the variant array. Even though the dataset to which
the DataSetProvider pointed may contain one or more indexes, the data packet contains no index
information. If you want indexes on the ClientDataSet, you must define or create them. ClientDataSet
indexes can be defined at runtime using the IndexDefs property, and this topic will be discussed at
length in a future article.
The ClientDataSet now behaves just like most any other opened TDataSet descendant. Its data can be
navigated, filtered, edited, indexed, and so forth. As pointed out earlier, any edits made to the
ClientDataSet will affect the contents of both the Data and Delta properties. In essence, these changes
are cached, and are lost if the ClientDataSet is closed without specifically telling it to save the changes.
Changes are saved by invoking the ClientDataSet's ApplyUpdates method.

Applying Changes to the Underlying Data Source


When you invoke ApplyUpdates, the ClientDataSet passes Delta to the DataSetProvider. How the
DataSetProvider applies the changes depends on how you have configured it. By default, the
DataSetProvider will create an instance of the TSQLResolver class, and this class will generate SQL
statements that will be executed against the underlying data source. Specifically, the SQLResolver will
generate one SQL statement for each deleted, inserted, and modified record in the change log. Both
the UpdateMode property of the DataSetProvider, as well as the ProviderFlags property of the TFields
for the provider's dataset, dictate exactly how this SQL statement is formed. Configuring these
properties will be discussed in a future article.
If the dataset to which the DataSetProvider points is an editable dataset, you can alternatively set the
provider's ResolveToDataSet property to True. With this configuration, a SQLResolver is not used.
Instead, the DataSetProvider will edit the dataset to which it points directly. For example, the
DataSetProvider will locate and delete each record marked for deletion in the change log, and locate
and change each record marked modified in the change log.
If you download the CDSLoadBehaviorDemo project, you can see this for yourself. From your designer,
select DataSetProvider1 and set its ResolveToDataSet property to True. Next, run the project and load
the ClientDataSet. After making several changes to the data, select File | ApplyUpdates. Depending on
the speed of your computer, you may or may not actually see the DBGrid become active as the TTable
is edited. However, on most systems you will notice the DBNavigator buttons become active briefly as
a result of the editing process. (If your computer is too fast, and you cannot see the DBGrid or the
DBNavigator become active, you can assign an event handler to the AfterPost or AfterDelete event
handlers of Table1, and issue a MessageBeep or ShowMessage call. That way you will prove to yourself
that Table1 is being edited directly.)
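The handlers suggested in the parenthetical note might look like the fragment below. It is a sketch only: TForm1 is an assumed form class name, the two methods must also be declared in that class and assigned to Table1's AfterPost and AfterDelete events, and the form unit's uses clause is assumed to include Windows (for MessageBeep) and DB.

```delphi
// Fragment of a form unit, not a complete program.

procedure TForm1.Table1AfterPost(DataSet: TDataSet);
begin
  // Fires each time the provider posts an insert or a modification
  // to the table during resolution.
  MessageBeep(0);
end;

procedure TForm1.Table1AfterDelete(DataSet: TDataSet);
begin
  // Fires each time the provider deletes a record from the table.
  MessageBeep(0);
end;
```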
There is a third option, which involves assigning an event handler to the DataSetProvider's
BeforeUpdateRecord event handler. This event handler will then be invoked once for each record in
the change log. You use this event handler to apply the changes in the change log programmatically,
providing you with complete control over the resolution process. Writing BeforeUpdateRecord event
handlers can be an involved process, and will be discussed in a future article.
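To give a feel for the shape of such a handler, here is a bare skeleton. It does not resolve anything itself; it only counts the records it is offered and leaves the real work to the default resolver. TForm1 and the three counter fields are hypothetical, the method must be declared in the form class, and the signature shown is the one used from Delphi 6 onward (the form unit's uses clause is assumed to include DB, DBClient, and Provider).

```delphi
// Fragment of a form unit, not a complete program.
// FInserted, FModified, and FDeleted are assumed to be Integer
// fields declared in TForm1.

procedure TForm1.DataSetProvider1BeforeUpdateRecord(Sender: TObject;
  SourceDS: TDataSet; DeltaDS: TCustomClientDataSet;
  UpdateKind: TUpdateKind; var Applied: Boolean);
begin
  // Called once for each record in the change log.
  case UpdateKind of
    ukInsert: Inc(FInserted);
    ukModify: Inc(FModified);
    ukDelete: Inc(FDeleted);
  end;

  // False lets the provider go on to resolve this record as usual.
  // A handler that applies the change itself would set this to True.
  Applied := False;
end;
```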
When you invoke ApplyUpdates, you pass a single integer parameter. You use this parameter to
identify your level of tolerance for resolution failures. If you cannot tolerate any failures to resolve
changes to the underlying data source, pass the value 0 (zero). In this situation the DataSetProvider
starts a transaction prior to applying updates. If even a single error is encountered, the transaction is
rolled back, the change log remains unchanged, and the offending record is identified to the
ClientDataSet (by triggering its OnReconcileError event handler, if one has been assigned).
If you pass a positive integer when calling ApplyUpdates, the transaction will be rolled back only if the
specified number of errors is exceeded. If fewer than the specified number of errors is encountered,
the transaction is committed and the failed records are returned to the ClientDataSet. Furthermore,
the applied records are removed from the change log, leaving only the changes that could not be
applied.
If the number of failures exceeds the specified number, the transaction is rolled back, the change log
is unchanged, and the records that could not be resolved are identified to the ClientDataSet as
described earlier.
You can also pass a value of -1 when invoking ApplyUpdates. In this situation no transaction is
started. Any records that can be applied are removed from the change log. Those whose resolution fails
will remain in the change log, and are identified to the ClientDataSet through its OnReconcileError
event handler.
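In code, the call and the error handler might be sketched as follows. This is a form-unit fragment under stated assumptions: TForm1 and ApplyPendingChanges are hypothetical names, both methods must be declared in the form class, the second must be assigned to the ClientDataSet's OnReconcileError event, and the uses clause is assumed to include SysUtils, Dialogs, DB, and DBClient. Choosing raSkip here is just one possible policy.

```delphi
// Fragment of a form unit, not a complete program.

procedure TForm1.ApplyPendingChanges;
var
  ErrorCount: Integer;
begin
  if ClientDataSet1.ChangeCount = 0 then
    Exit;   // nothing in the change log

  // 0 means that no resolution failures are tolerated.
  // ApplyUpdates returns the number of errors encountered.
  ErrorCount := ClientDataSet1.ApplyUpdates(0);
  if ErrorCount > 0 then
    ShowMessage(Format('%d change(s) could not be applied.',
      [ErrorCount]));
end;

procedure TForm1.ClientDataSet1ReconcileError(
  DataSet: TCustomClientDataSet; E: EReconcileError;
  UpdateKind: TUpdateKind; var Action: TReconcileAction);
begin
  // Called for each record that could not be resolved.
  ShowMessage('Could not apply change: ' + E.Message);

  // Skip this record and leave its change in the change log.
  Action := raSkip;
end;
```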
That's basically how it works, although there are a number of variations that I have not considered.
For example, it is possible to limit how many records the ClientDataSet gets from the DataSetProvider
using the ClientDataSet's PacketRecords and FetchOnDemand properties. Similarly, you can pass
additional information back and forth between the ClientDataSet and the DataSetProvider using a
number of provided event handlers. Future articles in this series will describe how and when to use
these properties.

Using ClientDataSets Nearly Everywhere


Now that we've overviewed the basic workings of the ClientDataSet and DataSetProvider components,
let's return to the premise that I laid out at the beginning of this article. As I mentioned in the
introduction, a strong argument can be made for using a ClientDataSet/DataSetProvider combination
anytime data needs to be modified programmatically or displayed using data-aware controls.
There are three basic benefits to using ClientDataSet and DataSetProvider components for all data
access.

1. The combination provides a consistent set of data access features, regardless of which data
access mechanism you are using.

2. Their use provides a layer of abstraction in the data access layer, making future changes to the
data access mechanism easier to implement.

3. For local file-based systems (Paradox or dBase tables, for example), the ClientDataSet can greatly
reduce table and index corruption.

Let's consider each of these points separately.

A Consistent, Rich Feature Set


The ClientDataSet provides your applications with a consistent and powerful set of features
independent of the data access mechanism you are using. Among these features are an editable result
set, on-the-fly indexes, nested datasets, ranges, filters, cloneable cursors, aggregate fields, group state
information, and much, much more. Specifically, even if the data access mechanism that you are using
does not support a particular feature, such as aggregate fields or cloneable cursors, you have access to
them through the ClientDataSet.

A Layer of Abstraction


In addition to the features supported by ClientDataSet, the ClientDataSet/DataSetProvider
combination serves as a layer of abstraction between your application and the data access mechanism.
If at a later time you find that you must change the data access mechanism you are using, such as
switching from using the Borland Database Engine (BDE) to dbExpress, or from ADO to InterBase
Express, your user interface features and programmatic control of data can remain largely unchanged.
You simply need to hook the DataSetProvider to the new data access components, and provide any
necessary adjustment to your DataSetProvider properties and event handlers.
Some people don't like the fact that a ClientDataSet holds changes in cache until you call
ApplyUpdates. Fortunately, for those applications that need changes to be applied immediately you
can make a call to ApplyUpdates from the AfterPost and AfterDelete event handlers of the
ClientDataSet.
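A minimal sketch of that approach, assuming a form class named TForm1 with the two methods declared in it and assigned to ClientDataSet1's AfterPost and AfterDelete events (the uses clause is assumed to include DB and DBClient):

```delphi
// Fragment of a form unit, not a complete program.

procedure TForm1.ClientDataSet1AfterPost(DataSet: TDataSet);
begin
  // Resolve the posted insert or modification right away,
  // tolerating no failures.
  ClientDataSet1.ApplyUpdates(0);
end;

procedure TForm1.ClientDataSet1AfterDelete(DataSet: TDataSet);
begin
  // Resolve the deletion right away.
  ClientDataSet1.ApplyUpdates(0);
end;
```

Passing 0 is a design choice for this sketch; any failure is then reported through OnReconcileError, if a handler has been assigned, and the change stays in the change log.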

Reduced Corruption


For developers who are still using local file-based databases, such as Paradox or dBase, there is yet
another very powerful argument. Hooking a ClientDataSet/DataSetProvider pair to a TTable can
reduce the likelihood of table or index corruption to near zero.
Table and index corruption occurs when something goes wrong while accessing the underlying table.
Since a TTable component has an open file handle on the underlying table so long as the TTable is
active, this corruption happens all too often in many applications. When the data is extracted from a
TTable to a ClientDataSet, however, the TTable is active for only very short periods of time: during
loading and resolution, to be precise (assuming that you set the TTable's Active property to False,
leaving the activation entirely up to the DataSetProvider). As a result, in most applications, accessing
a TTable's data using a ClientDataSet/DataSetProvider combination reduces the amount of time that a
file handle is open on the table to a fraction of one percent of what it is
when a TTable is used alone.

But It's Not for Every Application


While these arguments are compelling, I must also admit that this approach is not appropriate for
every application. That a ClientDataSet loads all of its data into memory makes its use much more
difficult when you are working with large amounts of data. There are work-arounds that you can use if
you point a ClientDataSet to, say, a multi-million record data source, but doing so sometimes requires
a fair amount of coding, thereby complicating the application.
For most applications, however, the combination of features provided by the ClientDataSet outweighs
the disadvantages. But even if you do not accept this argument, I think that you will find many
situations where the use of a ClientDataSet enhances your application's features, and simplifies your
efforts.

About the Author


Cary Jensen is President of Jensen Data Systems, Inc., a Texas-based training and consulting company
that won the 2002 Delphi Informant Magazine Readers Choice award for Best Training. He is the
author and presenter for Delphi Developer Days (www.DelphiDeveloperDays.com), an information-packed
Delphi (TM) seminar series that tours North America and Europe. Cary is also an award-winning,
best-selling co-author of eighteen books, including Building Kylix Applications (2001,
Osborne/McGraw-Hill), Oracle JDeveloper (1999, Oracle Press), JBuilder Essentials (1998,
Osborne/McGraw-Hill), and Delphi In Depth (1996, Osborne/McGraw-Hill). For information about
onsite training and consulting you can contact Cary at cjensen@jensendatasystems.com, or visit his
Web site at www.JensenDataSystems.com.
Copyright © 2002 Cary Jensen, Jensen Data Systems, Inc.
ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT CAN BE COPIED IN ANY FORM WITHOUT THE
EXPRESS, WRITTEN CONSENT OF THE AUTHOR.

Published on: 7/15/2002 5:19:25 PM

