
IBM Analytics

IBM InfoSphere Data Replication


(Change Data Capture)
Version 11.3.3.2
Apply Performance Update
Oronde J. Tucker

Table of Contents
Introduction
Configuration Description
  Test Methodology
    Insert-only Workload
    TPC-C workload
    Notes
    Change Data Capture Versions Used
Results
  Scalability Test with Parallelize by Table (Insert-only workload)
  Scalability Test with Parallelize by Table (TPC-C workload)
Summary
Appendix A: Machine and Environment Specifications
  Source Machine Details (Insert-only workload)
  Source Machine Details (TPC-C workload)
  Target Machine Details (both workloads)
Appendix B: Insert-only Workload Table Details
Appendix C: TPC-C Table Details
Appendix D: TPC-C Transaction Details
  TPC-C Transaction Table Distribution Details
References
Notices

Introduction
IBM InfoSphere Data Replication Change Data Capture 11.3.3.2 includes several performance enhancements.
This release includes increases in apply throughput and an overall reduction in target-side CPU
utilization. This article describes the benefits of these enhancements in greater detail.

Please note that the workloads used in these tests are not representative of all workloads and may not
match your production workload [1].

Configuration Description
Test Methodology
Two types of workloads were used for these tests:

• A workload consisting of inserts to replicated tables.
• A workload generated by running a TPC-C benchmark on the source database.

Tests used either DB2 on z/OS or Oracle on Linux as the source databases. In all cases, the source system
provided sufficient throughput to avoid being the bottleneck for the tests. Details of the environments
can be found in Appendix A: Machine and Environment Specifications.

Insert-only Workload
This series of tests involved replicating transactions consisting exclusively of inserts to a target Oracle
database. The number of CDC apply connections was changed for each test. All tests used a single CDC
subscription.

Details:

• CDC for DB2 on z/OS was used as the source
• 100% Insert workload
• 10 Inserts per commit
• 8 identical tables (described in Appendix B: Insert-only Workload Table Details)

[1] Performance is based on measurements and projections using CDC benchmarks in a controlled
environment. The results that any user will experience will vary depending upon many factors,
including considerations such as the amount of multiprogramming in the user's job stream, the I/O
configuration, the storage configuration, the amount of CPU capacity available during processing, and
the workload processed. Therefore, results may vary significantly and no assurance can be given that an
individual user will achieve results similar to those stated here. These results should be used for
reference purposes only.
The test scenarios (hardware configuration and workloads) used in this document to generate
performance data are not considered ‘best performance case’ scenarios. Performance may be better or
worse depending on the hardware configuration, data set types and sizes, and the overall workload on
the system.
The information contained in this document is distributed on an “AS IS” basis without any warranty
either expressed or implied. The use of this information or the implementation of any of these
techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate
them into their operational environment. While each item may have been reviewed by IBM for accuracy
in a specific situation, there is no guarantee that the same or similar results will be obtained
elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own
risk.

Test Variants
• No fast-apply – A single apply connection
• Parallelize by Table Fast Apply mode with 4 apply connections and 12,000 Unit of Work size
• Parallelize by Table Fast Apply mode with 8 apply connections and 12,000 Unit of Work size

TPC-C workload
This test involved running a TPC-C benchmark (http://www.tpc.org/tpcc/default.asp) test against the
source database and replicating its changes using CDC for Oracle. The test varied the number of CDC
apply connections in the target database. All tests used a single CDC subscription.

Details:

• CDC for Oracle was used as a source.

Test Variants
• No fast-apply – A single apply connection
• Parallelize by Table Fast Apply mode with 4 apply connections and 12,000 Unit of Work size
• Parallelize by Table Fast Apply mode with 16 apply connections and 12,000 Unit of Work size

Notes:
• The Fast Apply mode (Parallelize by Table) used for these tests has been selected for illustrative
purposes and may not be the most optimal mode for the TPC-C workload generated by the test.
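To see why the number of tables matters for a table-parallel apply mode, consider a simplified model in which every table is pinned to exactly one apply connection. The sketch below is an assumption made for illustration only: the way CDC actually assigns tables to connections is not described in this document, and the round-robin policy and the function name `assign_tables` are hypothetical. The table names are the nine TPC-C tables listed in Appendix C.

```python
from collections import defaultdict


def assign_tables(tables, connections):
    """Map each table to one apply connection, round-robin (illustrative policy)."""
    if connections < 1:
        raise ValueError("need at least one apply connection")
    assignment = defaultdict(list)
    for index, table in enumerate(tables):
        assignment[index % connections].append(table)
    return dict(assignment)


tpcc_tables = [
    "Customer", "District", "History", "Item", "New_Order",
    "Oorder", "Order_Line", "Stock", "Warehouse",
]

for connections in (1, 4, 16):
    plan = assign_tables(tpcc_tables, connections)
    busiest = max(len(assigned) for assigned in plan.values())
    print(f"{connections} connections: {len(plan)} in use, "
          f"{busiest} tables on the busiest connection")
```

Under this simplified policy, connections beyond the number of tables receive no work, and a table that receives most of the changes limits the throughput of its connection regardless of how many other connections exist.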

Change Data Capture Versions Used
The baseline was InfoSphere Data Replication Change Data Capture 11.3.3.1 (Release 7) for Oracle as the
source and target. It was compared with the initial release [2] of 11.3.3.2 on the source and target.

Results
Scalability Test with Parallelize by Table (Insert-only workload)
This test compares the throughput [3] of replication as the number of fast-apply connections increases.
Figure 1 shows that, with 8 apply connections, throughput increased from 18,194.6 rows per second with
11.3.3.1 to 26,861.1 rows per second with 11.3.3.2.

[2] Samples were taken with internal builds prior to the final release of 11.3.3.2 Release 12.
[3] Measurements of CPU utilization were not recorded for these tests.

Throughput Compared (rows per second)

Apply connections    11.3.3.1     11.3.3.2
1                    6,858.33     9,170.92
4                    13,674.50    19,667.67
8                    18,194.58    26,861.13

Figure 1 – Throughput of Inserts with varying number of apply connections

Figure 2 shows the relative gain in throughput as the number of apply connections increases to match
the number of tables. With 8 connections, throughput is 48% higher than the baseline.

[Chart: Scalability Compared to Baseline. Vertical axis: relative throughput gain, 0% to 60%.
Horizontal axis: number of apply connections, 0 to 9.]

Figure 2 - Relative throughput increases of Inserts
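The relative gains plotted in Figure 2 can be recomputed from the throughput values in Figure 1. The sketch below is one way to do that arithmetic. The pairing of values with releases for 8 connections is stated in the text; the pairing for 1 and 4 connections is read from the data labels of the chart and should be treated as an assumption.

```python
# Rows per second, taken from the data labels in Figure 1.
# The pairing for 1 and 4 connections is an assumption (see lead-in).
throughput = {
    1: {"11.3.3.1": 6858.33, "11.3.3.2": 9170.92},
    4: {"11.3.3.1": 13674.50, "11.3.3.2": 19667.67},
    8: {"11.3.3.1": 18194.58, "11.3.3.2": 26861.13},
}


def relative_gain(baseline, candidate):
    """Fractional throughput increase of candidate over baseline."""
    return candidate / baseline - 1.0


for connections, rates in throughput.items():
    gain = relative_gain(rates["11.3.3.1"], rates["11.3.3.2"])
    print(f"{connections} apply connection(s): {gain:.0%}")
```

For 8 apply connections this gives 26,861.13 / 18,194.58 - 1, or about 48%, which matches the figure quoted above.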



Scalability Test with Parallelize by Table (TPC-C workload)
This test uses a mixed workload of inserts (33%), updates (64%) and deletes (3%). These transactions are
distributed among 9 tables with different structures [4]. The throughput gains are marginal [5]. However,
the target-side [6] CPU savings range from 13% to 26%, as shown in Table 1. Note that the target-side
CPU reduction is present both with and without fast apply enabled.

Number of Apply Connections    Target-side CPU Reduction
1 (No fast apply)              18%
4                              13%
16                             26%

Table 1 - TPC-C Target-side CPU savings
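Because throughput rose slightly while target-side CPU fell, the CPU consumed per replicated row dropped by somewhat more than the figures in Table 1. The sketch below combines the two sets of numbers, using the throughput gains of 3%, 5% and 6% reported in the footnote for this test. It assumes that the CPU reduction in Table 1 refers to overall CPU utilization over the same interval and is not already normalized per row; this document does not say which, so the results are indicative only.

```python
# (target-side CPU reduction, throughput gain) per number of apply connections,
# taken from Table 1 and from the throughput gains reported in the footnote.
results = {1: (0.18, 0.03), 4: (0.13, 0.05), 16: (0.26, 0.06)}


def cpu_per_row_change(cpu_reduction, throughput_gain):
    """Relative change in CPU consumed per replicated row (negative = cheaper)."""
    return (1.0 - cpu_reduction) / (1.0 + throughput_gain) - 1.0


for connections, (cpu_reduction, gain) in results.items():
    change = cpu_per_row_change(cpu_reduction, gain)
    print(f"{connections} apply connection(s): {change:+.1%} CPU per row")
```

Under that assumption, the 16-connection case works out to roughly 30% less target-side CPU per row, compared with the 26% reduction in overall utilization.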

Summary
IBM InfoSphere Data Replication Change Data Capture 11.3.3.2 provides measurable performance
improvements in terms of throughput and target-side CPU utilization.

[4] Table layouts are described in Appendix C: TPC-C Table Details.
[5] Throughput gains were 3% for no fast apply, 5% with 4 apply connections and 6% with 16 apply
connections.
[6] Detailed CPU measurements were limited to the target side.

Appendix A: Machine and Environment Specifications


Source Machine Details (Insert-only workload)
• IBM InfoSphere Change Data Capture 10.2.1 for z/OS source was used.
• The CDC instance was configured on a zEC10 mainframe running DB2 for z/OS.

Source Machine Details (TPC-C workload)
• Intel Xeon CPU E5430 (2.66 GHz) 8 cores, 48268 MB RAM.
• Source database - Oracle version 11gR2.

Target Machine Details (both workloads)
• Power 740 with 256 GB RAM, 1000BASE-T network connection, 2 x POWER7 3.55 GHz 8-core CPUs
(max 4 threads per core).
• Target database - Oracle version 11gR2

Appendix B: Insert-only Workload Table Details
Listed below is the SQL statement used to create the table used for Insert-only workloads in Oracle. The
average row size was 682 bytes.

create table INSTAB001(
    CUSTNO number(10) not null,
    STATE char(2),
    BRANCH01 varchar2(3) not null,
    NAME01 varchar2(40),
    NAME02 varchar2(10),
    ADDRESS01 char(10),
    ADDRESS02 char(5),
    CITY01 char(3),
    STATE01 char(32),
    STATUS01 varchar2(1),
    CRLIMIT01 number(7,0),
    BALANCE01 number(3,0),
    REPNO01 number(3,0),
    BRANCH11 varchar2(3) not null,
    NAME11 varchar2(10),
    NAME12 varchar2(40),
    ADDRESS11 char(10),
    ADDRESS12 char(5),
    CITY11 char(3),
    STATE11 char(32),
    STATUS11 varchar2(1),
    CRLIMIT11 number(7,0),
    BALANCE11 number(3,0),
    REPNO11 number(3,0),
    BRANCH21 varchar2(3) not null,
    NAME21 varchar2(10),
    NAME22 varchar2(10),
    ADDRESS21 char(50),
    ADDRESS22 char(5),
    CITY21 char(3),
    STATE21 char(32),
    STATUS21 varchar2(1),
    CRLIMIT21 number(7,0),
    BALANCE21 number(3,0),
    REPNO21 number(3,0),
    BRANCH31 varchar2(3) not null,
    NAME31 varchar2(10),
    NAME32 varchar2(10),
    ADDRESS31 char(10),
    ADDRESS32 char(5),
    CITY31 char(3),
    STATE31 char(32),
    STATUS31 varchar2(1),
    CRLIMIT31 number(7,0),
    BALANCE31 number(3,0),
    REPNO31 number(3,0),
    BRANCH41 varchar2(3) not null,
    NAME41 varchar2(10),
    NAME42 varchar2(40),
    ADDRESS41 char(10),
    ADDRESS42 char(5),
    CITY41 char(3),
    STATE41 char(18),
    STATUS41 varchar2(1),
    CRLIMIT41 number(7,0),
    BALANCE41 number(3,0),
    REPNO41 number(3,0),
    BRANCH51 varchar2(3) not null,
    NAME51 varchar2(10),
    NAME52 varchar2(40),
    ADDRESS51 char(10),
    ADDRESS52 char(5),
    CITY51 char(3),
    STATE51 char(2),
    STATUS51 varchar2(1),
    CRLIMIT51 number(7,0),
    BALANCE51 number(3,0),
    REPNO51 number(3,0),
    BRANCH61 varchar2(3) not null,
    NAME61 varchar2(10),
    NAME62 varchar2(40),
    ADDRESS61 char(10),
    ADDRESS62 char(5),
    CITY61 char(3),
    STATE61 char(2),
    STATUS61 varchar2(1),
    CRLIMIT61 number(7,0),
    BALANCE61 number(3,0),
    REPNO61 number(3,0),
    primary key( CUSTNO ));

Appendix C: TPC-C Table Details
The following table provides an overview of the data types present in each of the tables used in the
TPC-C workload tests.

Table        Number  Char  VarChar2  Float  Date  Total Columns
Customer     8       5     6         1      1     21
District     5       2     4         -      -     11
History      6       -     1         -      1     8
Item         3       -     2         -      -     5
New_Order    3       -     -         -      -     3
Oorder       7       -     -         -      1     8
Order_Line   8       1     -         -      1     10
Stock        6       10    1         -      -     17
Warehouse    3       2     4         -      -     9

Table 2 - TPC-C Table data types



Appendix D: TPC-C Transaction Details
The following table lists the distribution of transaction sizes within the TPC-C test that was performed.

Transaction Size    Distribution
0 to 5KB            0%
5 to 10KB           36%
10 to 20KB          5%
20 to 40KB          32%
40 to 80KB          16%
>80KB               10%

Table 3 - TPC-C Transaction size distribution

TPC-C Transaction Table Distribution Details
The following table shows the distribution of transactions by type within the TPC-C test that was
performed.

Transaction Type    Distribution
NewOrder            45%
Payment             43%
OrderStatus         4%
Delivery            4%
StockLevel          4%

Table 4 - TPC-C Transaction Table Distribution
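A benchmark driver typically produces a mix like the one in Table 4 by drawing each transaction type at random with the given weights. The sketch below shows that idea with the Python standard library. It is an illustration of weighted sampling only, not the driver used for these tests; the function name `sample_transactions` and the fixed seed are assumptions made for the example.

```python
import random
from collections import Counter

# Transaction mix from Table 4 (the percentages sum to 100).
MIX = {"NewOrder": 45, "Payment": 43, "OrderStatus": 4, "Delivery": 4, "StockLevel": 4}


def sample_transactions(count, seed=0):
    """Draw count transaction types with probabilities proportional to MIX."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    names = list(MIX)
    weights = list(MIX.values())
    return rng.choices(names, weights=weights, k=count)


total = 100_000
counts = Counter(sample_transactions(total))
for name in MIX:
    print(f"{name:12s} {counts[name] / total:.1%}")
```

With a large enough sample the observed shares converge on the configured weights, so NewOrder and Payment together account for close to 88% of the transactions that feed the replicated change stream.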



References
• D. E. Difallah, A. Pavlo, C. Curino, and P. Cudre-Mauroux. OLTP-Bench: An extensible testbed for
benchmarking relational databases. In VLDB 2014.

Notices
© Copyright IBM Corporation 2016
All Rights Reserved.
IBM Canada
8200 Warden Avenue
Markham, ON
L6G 1C7
Canada

Neither this documentation nor any part of it may be copied or reproduced in any
form or by any means or translated into another language, without the prior
consent of the above mentioned copyright owner.

IBM makes no warranties or representations with respect to the content hereof
and specifically disclaims any implied warranties of merchantability or fitness for
any particular purpose. IBM assumes no responsibility for any errors that may
appear in this document. The information contained in this document is subject to
change without any notice. IBM reserves the right to make any such changes
without obligation to notify any person of such revision or changes. IBM makes
no commitment to keep the information contained herein up to date.

Performance is based on measurements and projections using standard IBM
benchmarks in a controlled environment. The actual throughput or performance
that any user will experience will vary depending upon many factors, including
considerations such as the amount of multiprogramming in the user's job stream,
the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results
similar to those stated here.

All performance data contained in this publication was obtained in the specific
operating environment and under the conditions described above and is
presented as an illustration only. Performance obtained in other operating
environments may vary, and customers should conduct their own testing.

The information in this document concerning non-IBM products was obtained
from the supplier(s) of those products. IBM has not tested such products and
cannot confirm the accuracy of the performance, compatibility, or any other
claims related to non-IBM products. Questions about the capabilities of non-IBM
products should be addressed to the supplier(s) of those products.

IBM, the IBM logo and InfoSphere are trademarks or registered trademarks of
International Business Machines Corporation in the United States, other
countries, or both. Other company, product, or service names may be
trademarks or service marks of others. References in this publication to IBM
products or services do not imply that IBM intends to make them available in all
countries in which IBM operates.
