Introduction to DB2
DATABASE 2 (DB2) for Linux, UNIX, and Windows is a relational data
server developed by IBM. Version 10.5, available since June 2013, is the
most current version of the product and the one on which we focus in
this book.
In this chapter, you learn about:
• The history of DB2
• DB2 in the big data world
• DB2 for Linux, UNIX, and Windows product portfolio
• Syntax diagram conventions
Note
The term “Universal Database” or “UDB” was dropped from the name for
simplicity starting from DB2 version 9.
Note that new release numbers are not necessarily consecutive within a version.
For example, DB2 version 10 release 1 (DB2 10.1) was released in 2012, and
DB2 10.5 was released in 2013. There were no DB2 10.2, DB2 10.3 or DB2 10.4
in between. Going from DB2 10.1 to DB2 10.5 directly signifies major changes or
new features added in the product.
This is the situation where a company has many repositories of data but
is unable to understand it because it is isolated and assigned to
independent silos. Attempts to expose the data might require copying it
to a central location accessible by the company’s content management
system. This can be costly and time-consuming. One way to get up and
running quickly and discover and retrieve relevant big data is to use IBM
InfoSphere Data Explorer (Data Explorer). With this software, the
company can search and navigate big data in place from federated
sources.
Many customers would like to reduce the overall cost to maintain data in
the warehouse. DB2 has a feature called multi-temperature data
management, discussed in Chapter 7, “Working with Database Objects,”
that can help reduce storage costs. If the amount of data in the
warehouse that is seldom used and kept “just in case” is very large, and if
the company has already invested in a Hadoop cluster, an alternative to
the multi-temperature data management feature is to offload the
seldom-used data to the Hadoop cluster. This lowers costs because this data is
moved to commodity hardware but is still available for querying; the
data in the Hadoop system is automatically replicated at the software
layer, providing for fault tolerance. For this customer need, BigInsights
is the product that can be used (see Figure 1.3).
Figure 1.3 Reducing costs with Hadoop by moving cold warehouse data to BigInsights
Figure 1.4 shows a BigInsights cluster on the left and several traditional
relational databases on the right.
Figure 1.4 Connectivity between Netezza, DB2 or other RDBMS, and BigInsights
As you can see from the figure, BigInsights can drive Relational Database
Management System (RDBMS) work by behaving as a database client
and retrieving information from an RDBMS. This is mainly performed
using JDBC and an open source language called Jaql
(http://www-01.ibm.com/software/data/infosphere/hadoop/jaql/). In
addition, the BigInsights web console also includes a database
import/export application. The primary objective of this support is to
provide dynamic transfers of data between BigInsights and the target
DBMS.
In addition, you can have DB2 LUW initiate a Jaql job on BigInsights
(DB2 as a client to BigInsights). DB2 includes sample user-defined
functions (UDFs) to support this.
With either approach, users can write queries and applications that
integrate data managed by BigInsights and an RDBMS or warehouse.
BigInsights Filtering, Aggregating, and Transforming Data for the Data
Warehouse
After storing raw data in BigInsights, firms can manipulate, analyze, and
summarize the data to gain new insights as well as feed downstream
systems. In this manner, both the original (raw) data and modified forms
are accessible for further processing.
One potential deployment approach involves using BigInsights as a
source for a data warehouse. BigInsights can sift through large volumes
of unstructured or semi-structured data, capturing relevant information
that can augment existing corporate data in a warehouse. Figure 1.5
illustrates such a scenario, which offers firms the ability to broaden their
analytic coverage without creating an undue burden for their existing
systems. Once the extracted data is in the warehouse, traditional business
intelligence and query/report writing tools can work with these extracted,
aggregated, and transformed portions of the raw data stored in BigInsights.
Figure 1.5 BigInsights filtering, aggregating, and transforming data for the data
warehouse
DB2 EDITIONS
DB2 for Linux, UNIX, and Windows (sometimes referred to as LUW) is
developed using the C/C++ language. More than 90 percent of the code
is common among these platforms. The remaining code is unique to take
full advantage of the underlying platform architecture; however, the
database functionality on all of these platforms is the same.
DB2 for z/OS, DB2 for VM/VSE, and DB2 for i use a different code base
than that used by DB2 LUW. Note, however, that the Linux operating
system extends across all of IBM’s servers: System x®, System p®, and
System z®. DB2 for Linux on all of these server platforms is the same.
Thus, DB2 for Linux on System z uses the same code base and is licensed
in the same way as DB2 for Linux on a System x (Intel®) platform.
Prior to DB2 10.5, each edition was built on top of the other by linking
modules or object files that contained additional functionality. The core
of the DB2 code was common across all editions. In this model each DB2
Edition had its own separate install image; if you wanted to upgrade
from one DB2 Edition to a higher-end one to gain extra functionality,
you had to install a separate image. Considering that some enterprises
have IT policies where any new software installation requires full
application testing and certification, this would incur a lot of
unnecessary delays.
In DB2 10.5, selected editions share a common install image. There is no
need to install new DB2 code if you want to upgrade an existing
installation to a higher-end edition. A simple license certificate (also
known as license key) update unlocks new capabilities in your existing
DB2 installation. This makes it easy to deploy and start taking advantage
of the more advanced DB2 editions in your enterprise. Chapter 3,
“Installing DB2,” provides more details about installing licenses.
There are seven editions and one feature offering in DB2 10.5:
• DB2 Advanced Enterprise Server Edition (AESE)
• DB2 Advanced Workgroup Server Edition (AWSE)
• DB2 Enterprise Server Edition (ESE)
• DB2 Workgroup Server Edition (WSE)
• DB2 Express Server Edition
• DB2 Express-C
• DB2 Developer Edition
• DB2 Advanced Recovery Feature
The DB2 10.5 offering portfolio gives you options to deploy DB2 in a
single-server or multi-server clustered environment. The underlying
technology in the portfolio can handle transactional processing,
analytical processing, and mixed transactional and warehouse
workloads. DB2 10.5 gives you incredible deployment flexibility without
having to compromise on functionality or performance.
Note
All functionality and tools offered in prior versions of DB2 for Linux, UNIX, and
Windows and InfoSphere Warehouse editions and features are consolidated and
available in the DB2 10.5 offerings.
Four DB2 editions, namely AESE, AWSE, ESE, and WSE, share a common install
image. Applying the corresponding license certificate unlocks or disables
capabilities in your existing DB2 installation.
Figure 1.6 gives you an overview of all seven DB2 editions. Let’s discuss
each of these editions starting with DB2 Express-C, the entry-level
offering at the bottom of the diagram.
Figure 1.6 DB2 10.5 Editions
DB2 Express-C
DB2 Express-C is a DB2 data server available for download and
deployment at no charge. The C stands for Community; this means it is
an edition for the community of database enthusiasts. It is an ideal data
server for small businesses and developers who develop database
applications for their clients. The core code of DB2 Express-C is the same
as the other DB2 editions. DB2 Express-C can be used in production or
in a commercial setting. In addition, businesses can embed and
distribute DB2 Express-C as part of their applications at no charge. DB2
Express-C does not impose limits on the number of instances per server
or the number of users. However, it does have some system resource
usage limitations. DB2 Express-C is available on Windows and Linux (on
x86, x86-64, and Power), and Solaris on x86-64 systems. At the time of
writing, it is also available on the Mac for DB2 10.1. It can utilize at most
16GB of memory and 2 processor cores. This memory limit is
significantly higher than in previous versions of DB2 Express-C.
You can obtain DB2 Express-C technical assistance through a free
community-based online forum. An optional renewable 12-month
subscription license can be purchased for DB2 Express-C to obtain IBM
DB2 Technical support (24/7) and also support for the SQL replication
and the High Availability and Disaster Recovery (HADR) functionality.
For further details, refer to the DB2 Express-C website:
www.ibm.com/software/data/db2/express-c/.
Note
All DB2 editions are only available in 64-bit with the exception of DB2 Express-C
and the DB2 Developer Edition, which also come in 32-bit for either Windows or
Linux platforms.
To find out more about what each DB2 edition offers, please refer to the DB2
Information Center at
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.licensing.doc/doc/r0053238.html.
Note
DB2 Connect license entitlement is only required when connecting from DB2
LUW to a host DB2 server, such as DB2 for z/OS; it is not required in the other
direction, for example, when DB2 for z/OS behaves as the client, and DB2 LUW
is the server. DB2 Connect license entitlement can be obtained from the DB2
Connect software or by applying the DB2 Connect license certificate to a Data
Server Client installation.
Note
IBM InfoSphere Federation Server and IBM InfoSphere Data Replication were
formerly known as WebSphere Information Integrator.
>>-mandatory_field--------------------------------------><
>>-mandatory_field--+----------------+------------------><
                    '-optional_field-'
If an optional field appears above the main path, it means that it is the
default option.
                    .-default_field-.
>>-mandatory_field--+---------------+-------------------><
                    '-optional_field-'
>>-mandatory_field--+-mandatory_choice1-+---------------><
                    '-mandatory_choice2-'
An arrow returning to the left, above the main line, indicates an option
can be repeated. In this example, repeated options are separated by one
or more blanks.
                     .-----------------.
                     V                 |
>>-mandatory_field----repeatable_field-+----------------><
If the repeat arrow contains a comma, you must separate repeated items
with a comma.
                     .-,---------------.
                     V                 |
>>-mandatory_field----repeatable_field-+----------------><
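As a rough illustration of the two repeat-arrow forms above, the following Python sketch renders a statement from a mandatory field and a list of repeated items. The function name render_repeat and the sample item names are hypothetical and not part of any DB2 tool; the sketch only mirrors the separator rule described here.

```python
def render_repeat(mandatory, items, comma=False):
    """Render a mandatory field followed by one or more repeated items.

    comma=False models a plain repeat arrow (items separated by blanks);
    comma=True models a repeat arrow that contains a comma.
    """
    if not items:
        # The repeatable field sits on the main path, so at least one item is required.
        raise ValueError("at least one repeated item is required")
    separator = ", " if comma else " "
    return mandatory + " " + separator.join(items)


# Hypothetical item names, used only to show the two separator styles.
print(render_repeat("mandatory_field", ["item1", "item2"]))
print(render_repeat("mandatory_field", ["item1", "item2"], comma=True))
```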
Example 1:
>>----Word1----+----------------+-----------------------<><
               '-optionalWord---'
Output 1: Word1
Output 2: Word1 optionalWord
The optionalWord token is optional and does not have to be included in
the statement.
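One way to check your reading of a diagram like Example 1 is to enumerate every statement it allows. The Python sketch below does this for diagrams made only of mandatory and optional tokens; the expand function and the (token, required) representation are assumptions made for illustration, not an IBM-provided format.

```python
from itertools import product


def expand(diagram):
    """Return every statement a simple diagram allows.

    diagram is a list of (token, required) pairs read left to right.
    Optional tokens (required=False) may be present or absent.
    """
    # A mandatory token has one choice; an optional token has two (absent or present).
    choices = [[token] if required else ["", token] for token, required in diagram]
    statements = []
    for combo in product(*choices):
        statements.append(" ".join(part for part in combo if part))
    return statements


# Example 1: Word1 is on the main path, optionalWord is below it.
example_1 = [("Word1", True), ("optionalWord", False)]
for statement in expand(example_1):
    print(statement)
```

Run as is, the loop prints the same two statements listed as Output 1 and Output 2.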
Example 3:
.-,----------. .----Word2----.
V | V |
>>----Word1-------Variable--+-------------------------------<><
Output 1: Word1
Output 2: Word1 Variable1, Variable 2, Variable 3
Output 3: Word1 Word2
Output 4: Word1 Variable 1 Word2
You should now feel comfortable reading syntax diagrams in the DB2
documentation. Browse through the DB2 online documentation and
review some examples there.
>>-+-STOP--+-DATABASE MANAGER-+-+------------------------------->
   |       +-DB MANAGER-------+ |
   |       '-DBM--------------' |
   '-db2stop--------------------'
>--+--------------------------------+--------------------------->
   '-+----------------+--identifier-'
     +-DBPARTITIONNUM-+
     +-MEMBER---------+
     '-CF-------------'
>--+----------------------------------+--+------------------+--->
'-FORCE-'
>--+---------------------------------+-------------------------->
   '-DROP DBPARTITIONNUM--identifier-'
>--+-----------------------------------------------+------------>
   '-FORCE--+------------------------------------+-'
            '-+----------------+----identifier---'
              +-DBPARTITIONNUM-+
              +-MEMBER---------+
              '-CF-------------'
>--+----------------------+------------------------------------><
   '-QUIESCE--+---------+-'
              '-minutes-'
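To practice translating this diagram into actual command text, the Python sketch below assembles a db2stop command string from a subset of the options shown: the optional DBPARTITIONNUM, MEMBER, or CF keyword with its identifier, FORCE, and QUIESCE with a number of minutes. The build_db2stop helper is hypothetical and only concatenates text in the order the diagram is read; it does not run the command or verify which combinations a particular DB2 environment accepts.

```python
def build_db2stop(kind=None, identifier=None, force=False, quiesce_minutes=None):
    """Assemble a db2stop command string from a subset of the diagram's options.

    kind: None, "DBPARTITIONNUM", "MEMBER", or "CF" (optional keyword before identifier)
    identifier: number of the partition, member, or CF to stop
    force: append FORCE
    quiesce_minutes: append QUIESCE followed by this many minutes
    """
    parts = ["db2stop"]
    if kind is not None:
        if kind not in ("DBPARTITIONNUM", "MEMBER", "CF"):
            raise ValueError(f"unknown keyword: {kind}")
        if identifier is None:
            # In the diagram the keyword is optional, but identifier must follow it.
            raise ValueError("a keyword must be followed by an identifier")
        parts.append(kind)
    if identifier is not None:
        parts.append(str(identifier))
    if force:
        parts.append("FORCE")
    if quiesce_minutes is not None:
        parts += ["QUIESCE", str(quiesce_minutes)]
    return " ".join(parts)


print(build_db2stop())
print(build_db2stop(force=True))
print(build_db2stop("MEMBER", 0, quiesce_minutes=30))
```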
CASE STUDY
John recently graduated from Pennsylvania State University, where he
learned DB2 as part of the IBM Academic Initiative program
(http://www-03.ibm.com/ibm/university/academic/pub/page/academic_initiative),
a program that provides DB2 and other IBM software for
free to teach database skills.
While at school, he worked with a DB2 Enterprise Server Edition
installed on a System p machine at the university computer lab. He was
given SYSADM authority and was able to see connections from many
different clients to all created databases using the list applications
command. John wanted to develop a Java application using the JDBC
Type 4 driver, so he downloaded and installed on his laptop the 90-Day
Try-and-Buy version of DB2 Express Server Edition. With this edition,
he was able to build, test, and run his application and connect to the
database created on his laptop. Because the client and data server were
both on his laptop, he was dealing with a local client connection.
John wanted to test whether his application would work as a remote
client, so he used the client software that comes with DB2 Express to test
his application against the database he had created earlier on the
university’s System p machine. This also worked, and John was feeling
like a DB2 guru.
Eager to show his program to his colleagues, he emailed the executable
to his friend Peter, who had just bought a new laptop with Microsoft®
Windows installed. Peter detached the file and tried to run the
application against John’s database on the university’s System p server.
After spending a few hours trying to figure out why he couldn’t, he
dropped by John’s place. John realized that Peter had to download and
install either an IBM Data Server Runtime Client, an IBM Data Server
Client, or just the IBM Data Server JDBC and SQLJ driver, as he needed
the JDBC Type 4 driver on his laptop. Given that Peter was not going to
develop a new program or administer a database, John asked Peter to
download just the IBM Data Server JDBC and SQLJ driver from the IBM
website. After installation, voilà! The program ran successfully.
After John’s graduation, he applied for a position at a medium-sized
company in his hometown. To prepare for his interview, John again
tested the program he had written against his laptop database, but the
Try-and-Buy evaluation period had expired. John figured that he would
always need DB2 on his laptop, so he decided to buy the permanent
license. When John received the license file after the purchase, he
installed it on his laptop with the command db2licm -a filename. With
this problem resolved, John demonstrated his program during the job
interview and was immediately hired.
The company John worked for was using DB2 Workgroup. His manager
wanted to expand and modify the application running on DB2 LUW to
also connect to a DB2 for z/OS host data server. In general, applications
built for DB2 LUW work fine with DB2 for z/OS. However, there are
some commands and SQL that might not be compatible. John used “SQL
Reference for Cross-Platform Development”
(http://www.ibm.com/developerworks/data/library/techarticle/0206sqlref/0206sqlref.html)
to help identify and handle the incompatibilities.
John led the project and demonstrated that no changes were needed for
the application; it simply needed to also connect to the database on DB2
for z/OS. However, DB2 Workgroup does not come with the DB2 Connect
software component, so the company would need to purchase this
software or get the DB2 Developer Edition, as it would be cheaper and
have most of the software for all DB2 editions. DB2 Developer is licensed
per developer. This company did not have that many developers, so this
option satisfied their needs. If the company decided to use DB2 in a
production environment, they would have to buy the appropriate edition
and license.
Given that the company was expanding, the marketing and sales team in
his company needed to make real-time decisions to be responsive and
competitive. John performed a study to collect requirements from these
teams and realized that structured and unstructured data would be
growing exponentially in the near term and that this would impact
performance. Moreover, new queries with analytic workloads would
cripple the existing system if left untouched. After his thorough study, he
recommended the IBM PureData System family to the Chief Technology
Officer (CTO). They reviewed the features and benefits of each PureData
System model and felt that the IBM PureData System for Operational
Analytics fit their needs perfectly.
John contacted IBM and participated in the PureExperience™ program.
His technical team and IBM performed a proof of concept with the
company’s applications and the PureData System. The optimized data
warehouse system satisfied what John and his company were looking for:
• Leveraged the PureData System built-in expertise and received high
performance analysis of both historical and operational data
• Deployed the PureData System in hours and provided insights into new
areas of the business immediately
• Simplified and reduced IT life cycle and costs significantly
After a successful implementation of the IBM PureData system in his
company, John was happy with the results. However, he knows that to
stay competitive, his company needs to be ahead of the curve. John
stays on top of new offerings added to the IBM PureData family and
monitors IBM marketing programs to utilize the investment already
made and the skill set his team already has.
SUMMARY
This chapter gave you a brief history of DB2, from the past to the
present. Each DB2 release has brought breakthrough technologies and
new value propositions to customers. DB2 10.5 is not an exception. With
its BLU Acceleration capability, it provides significant response time
improvements in reporting and analytics workloads and significant
storage space savings.
The chapter also discussed the concepts of big data, cloud computing,
and smarter planet, and explained where DB2 fits in the IBM Big Data
platform. Within this platform, we also talked about the IBM PureData
System for Transactions and IBM PureData System for Operational
Analytics, both of which use DB2 as their core. IBM PureData Systems
offer built-in expertise, integration by design, and a simplified
experience throughout the life cycle.
You have also seen the new DB2 10.5 product offering portfolio. You can
start with a DB2 edition that offers core functionality and limited
system resource capacity. As your business needs grow, you can move
to a more advanced edition. For editions that share a common install
image, this requires only a license key update to your existing
installation, which unlocks additional functionality in the product.
DB2 is a client-server data management product. This chapter also
discussed the types of clients available with DB2. You can use the
traditional full Data Server Client or the individual lightweight Data
Server drivers. Different drivers are available to provide different types
of application interface support, such as JDBC, SQLJ, ODBC, CLI, .NET,
OLE DB, PHP, and Ruby On Rails.
REVIEW QUESTIONS
1. Why is DB2 called “DATABASE 2”?
2. What DB2 for LUW technology is used in the IBM PureData System
for Transactions?
3. Can an application developed for DB2 Express Server Edition work
with DB2 Enterprise Server Edition?
4. Is DB2 Connect required to connect from a DB2 for z/OS client to a
DB2 for Linux, UNIX, and Windows server?
5. What is the name of the technology in DB2 for LUW that provides
memory-based column store?
6. With any edition of DB2 for LUW installed, is IBM InfoSphere
Federation Server entitlement needed to set up a federation environment
between a DB2 LUW server and an Informix server?
7. Why are there different models of PureData System?
8. Does DB2 for Linux, UNIX, and Windows have one single file that is
used for installation on any of these platforms?
9. What does the DB2 database partitioning enable you to do?
10. What should you do when your Try-and-Buy license period expires
and you would like to buy a permanent license?
11. Which of the following products is the minimum required on the
Windows client to run a DB2 application accessing a DB2 database on
UNIX?
A. DB2 Enterprise
B. DB2 Express-C
C. DB2 Connect
D. IBM Data Server Runtime Client
12. Which of the following products is the minimum required to run a
DB2 application using JDBC Type 4?
A. IBM Data Server Client
B. IBM Data Server Runtime Client
C. Data Server Driver for JDBC and SQLJ
D. Data Server Client
13. Which of the following DB2 editions can I choose from if I need
warehouse and data partition functionality?
A. DB2 Workgroup
B. DB2 Express
C. DB2 Enterprise
D. DB2 Advanced Workgroup
14. Which of the following DB2 editions does not include SQL replication
technology?
A. DB2 Workgroup
B. DB2 Express-C
C. DB2 Express
D. DB2 Enterprise
15. Which of the following DB2 editions does not have system resource
usage limitations?
A. DB2 Workgroup
B. DB2 Advanced Workgroup
C. DB2 Express
D. DB2 Enterprise
16. In a small start-up software development company, there are three
developers developing and testing their existing Oracle application to
also support DB2 for LUW. Due to the tight software release cycle, they
need to install DB2 on multiple servers to leverage their testing system in
parallel. Which of the following products fits their need the most?
A. DB2 Enterprise
B. DB2 Advanced Enterprise
C. DB2 Developer
D. DB2 Advanced Workgroup
17. A software development company would like to test an application
that connects to both DB2 for LUW as well as DB2 for z/OS. Which of
the following would suit its needs the best?
A. DB2 Enterprise Server Edition
B. DB2 Workgroup Server Edition
C. DB2 Connect Enterprise Edition
D. DB2 Developer Edition
18. Which of the following software products can run on a System z server?
A. DB2 for Linux, UNIX, and Windows
B. DB2 for iSeries
C. DB2 Connect
D. Data Server Runtime Client
19. Which of the following is the most suitable client for application
distribution in a mass deployment scenario?
A. IBM Data Server Driver Package
B. IBM Data Server Driver for ODBC and CLI
C. IBM Data Server Runtime Client
D. IBM Data Server Client
20. Which IBM data server client provides support for JDBC and SQLJ
applications? (Choose all that apply.)
A. IBM Data Server Driver Package
B. IBM Data Server Driver for JDBC and SQLJ
C. IBM Data Server Runtime Client
D. IBM Data Server Client