Professional Documents
Culture Documents
A Look Inside
Postgres Plus Advanced Server
http://www.enterprisedb.com
Executive Summary
For more than 20 years, PostgreSQL has earned the reputation as the
worlds most advanced open source database. EnterpriseDBs Postgres
Plus Advanced Server builds on the rock-solid PostgreSQL foundation to
deliver significant value-added features not available in the community
edition.
The capabilities added to PostgreSQL by EnterpriseDB address a wide
range of enterprise requirements, including application development
enhanced performance, runtime management, usability, and scalability. In
addition, EnterpriseDB provides 24x7 follow the sun technical support,
as well as training and professional services needed by application
developers, DBAs and global operations teams.
Organizations seeking to leverage the exceptional price/performance of an
enterprise-class open source database, including current users of
PostgreSQL, are well served by Postgres Plus Advanced Server and the
services designed to optimize and guarantee continued performance.
An in depth discussion targeted specifically to your organizations
requirements can be scheduled with an EnterpriseDB domain expert by
sending an email to sales@enterprisedb.com.
Overview
Performance
o DynaTune
o Query Optimizer Hints
o EDB*Loader
o Bulk SQL Operations
o Query Profiler
o Asynchronous Pre-Fetch
Management
o DBA Management Server
Performance
Postgres
Plus
Advanced
Servers
value-added
performance
enhancements are derived primarily from EnterpriseDBs extensive
investments in Oracle compatibility. Many of the features originally
created to support Oracle compatibility are used extensively today in nonOracle applications, including Optimizer Hints, EDB*Loader and Bulk SQL
Operations. For a detailed discussion of all the Oracle features available in
Postgres Plus Advanced Server refer to the EnterpriseDB white paper
titled: Delivering Oracle Compatibility.
DynaTune
Figure1:DynaTunesetupintheinstaller
PREPARE foo(int) AS
SELECT /*+ INDEX(emp emp_pk) */ empno, ename, dname
FROM emp, dept
WHERE emp.deptno = dept.deptno
AND empno = $1
EXECUTE foo(7369);
The above query produces the query plan below. The emp_pk index is
used to execute the query when a sequential scan of the emp table
would normally be preferred.
Figure2:Anoptimizedqueryplan
There are many types of query optimizer hints available to tailor the
execution plan to the needs of the application. The query can be optimized
for a plan that returns the initial rows quickly or to use a certain type of
join. These hints are all intended to give application developers additional
flexibility and control over query execution to assure consistent application
performance and end user experience.
The table below enumerates the query optimizer hints available in
Postgres Plus Advanced Server.
QueryHint
Description
ALL_ROWS
Optimizesforretrievalofallrowsoftheresultset.
CHOOSE
FIRST_ROWS
Optimizesforretrievalofonlythefirstrowoftheresultset.
FIRST_ROWS_10
Optimizesforretrievalofthefirst10rowsoftheresultsset.
FIRST_ROWS_100
Optimizesforretrievalofthefirst100rowsoftheresultset.
FIRST_ROWS_1000
Optimizesforretrievalofthefirst1000rowsoftheresultset.
FIRST_ROWS(n)
Optimizesforretrievalofthefirstnrowsoftheresultset.
FULL(table)
Performafullsequentialscanontable.
INDEX(table[index][])
Useindexontabletoaccesstherelation.
NO_INDEX(table[index][]) Donotuseindexontabletoaccesstherelation.
USE_HASH(table[])
Useahashjoincreatedfromthejoinattributesoftable.
NO_USE_HASH(table[])
USE_MERGE(table[])
Useamergesortjoinfortable.
NO_USE_MERGE(table[])
Donotuseamergesortjoinfortable.
USE_NL(table[])
Useanestedloopjoinfortable.
NO_USE_NL(table[])
Donotuseanestedloopjoinfortable.
Table1:QueryOptimizerHintsavailableinPostgresPlusAdvancedServer
EDB*Loader
Figure3:Copyvs.DirectLoadofdata
The differences between the Copy loading method and the Direct Path
loading method are contrasted in Figure 3.
SQL commands that return result sets consisting of large numbers of rows
may not operate as efficiently as possible due to the constant context
switching that must occur between the database server engine and the
procedural language code in order to transfer massive result sets.
This inefficiency can be mitigated by using a collection to gather result
sets into memory which the client can then access. Postgres Plus
Advanced Servers stored procedure language (SPL) includes the BULK
COLLECT clause which is used to specify the aggregation of large result
sets into an in-memory collection. By changing PL/pgSQL functions that
manipulate large cursors to SPL functions using BULK COLLECT,
performance can be increased by 100% or more.
To demonstrate the use of BULK COLLECT, consider the function
samplefunction1() in the code fragment below.
WHERE 1=0;
10
AS $$
DECLARE
arow accounts%ROWTYPE;
acur CURSOR FOR
SELECT * FROM accounts;
BEGIN
OPEN acur;
LOOP
FETCH acur INTO arow;
EXIT WHEN NOT FOUND;
INSERT INTO tab1 (aid, bid, abalance, filler)
VALUES (arow.aid, arow.bid, arow.abalance, arow.filler);
END LOOP;
CLOSE acur;
END;
$$ LANGUAGE 'plpgsql';
SELECT samplefunction1();
Query Profiler
11
The resulting list of SQL commands can be ordered on the basis of Total
Execution Time, Average Execution Time, or Statement Count (the
number of times a given SQL command was repeated) by selecting the
desired ranking from the Order By drop down list.
The types of SQL commands that appear can be filtered by checking the
appropriate boxes:
Include Inserts
Include Updates
Include Deletes
Include Selects
Include Others
Below is a sample of the output available from the Query Profiler in the
DBA Management Server.
12
Figure4:SampleQueryProfileroutput
Asynchronous Pre-Fetch
Striping is when the RAID controller splits the data into multiple
chunks and writes each chunk to a different disk drive.
Mirroring is when the RAID controller writes an exact copy of the
data to each disk drive.
13
Figure5:WithoutAsynchronousPreFetch
14
Figure6:WithAsynchronousPreFetch
The index is highly de-clustered that is the rows are not stored in
the same order as the index.
The index is narrow (one or a small number of columns comprise
the index).
A large number of rows are expected to be retrieved.
15
Management
Figure7:ThebrowserbasedDBAManagementServer
16
While simple operating system specific scheduling tools such as cron may
work adequately for simple development and testing purposes, their oneoff batch nature presents consistency problems when deployed in
enterprise infrastructures. The job scheduler in the DBA Management
Server provides a consistent browser based interface across all operating
systems allowing administrators to easily view all Postgres maintenance
jobs from a single console.
Figure8:TheDBAManagementServerJobScheduler
The DBA Monitoring Console provides DBAs a quick, easy-to-use, realtime dashboard into the system usage consumed by one or more
Postgres Plus and/or Postgres Plus Advanced Server database servers.
Graphs are presented for the following information:
17
Figure9:TheDBAMonitoringConsole
The graphs can be displayed on one screen as shown above, or they can
be displayed individually. A preference can be set for each graph to
display the information as a line or as a bar chart.
A chart can also be displayed showing the individual processes connected
to each database in the server and its current query.
18
Figure10:TheDBAMonitoringConsoleServerProcesses
The rate at which the information is refreshed can be adjusted for the
server processes chart as well as for the four graphs.
19
Figure11:DRITASystemWaitsView
20
Figure12:DRITAOutputfromedb$snap
21
STEP 2: Examining user activity details. Armed with the id of the potential
problem session, the DBA can then drill down for details, find the problem
and take corrective action.
Figure13:DRITAoutputfromsessid_rpt
Update Service
22
Figure14:DBAManagementServerUpdateService
Audit Logging
These audited activities include database access and usage along with
data creation, change, or deletion. The auditing system is based on
configuration parameters defined in the configuration file.
23
Figure15:DBAManagementServerAuditingReport
The following audit features are also available to help DBAs utilize, tune,
and manage the auditing system:
Database Migration
24
Object
Schema
Data
Constraints
Sequences
Synonyms
Indexes
Packages
StoredProcedures
Triggers
Functions
Views
Users,Roles
DatabaseLinks
ListPartitionedTables
RangePartitionedTables
Sybase
Microsoft
Oracle
Table2:DatabaseMigration
Usability
25
Postgres Plus Advanced Server is built upon the enterprise class features
of the standard PostgreSQL source base and adds to that foundation
multiple additional open source modules that can be classified into four
categories: Extensions, Security, Procedural Languages (Postgres
supports over 10 programming languages), and Contributing modules.
The tables below summarize all the component projects that comprise
Postgres Plus Advanced Server:
Category
Extensions
Feature
Description
Slony
Slonyisatriggerbasedasynchronousmastertomany
slavesreplicationsystem.
pgAdminIII
pgAdminIIIisagraphicaladministrationanddevelopment
platformforPostgres.
PostGIS
PostGISisanextensiontoPostgresenablingrobustspatial
databasecapabilities.
SSL
PostgresPlushasnativesupportforusingSSLconnections
toencryptclient/servercommunicationsforincreased
security.
Kerberos
PostgresPlussupportsKerberosversion5asan
authenticationmethod.
PL/pgSQL
PL/pgSQLisaloadableprocedurallanguagethatadds
controlstructurestotheSQLlanguage.
PL/Perl
PL/Perlisaloadableprocedurallanguagethatenables
userstowritePostgresPlusfunctionsinthePerl
programminglanguage.
PL/TCL
PL/TCLisaloadableprocedurallanguagethatenables
userstowritePostgresPlusfunctionsintheTCL
programminglanguage.
Security
Procedural
Languages
Contrib
Modules
26
PL/Java
PL/Javaisaloadableprocedurallanguagethatenables
userstowritePostgresPlusfunctionsintheJava
programminglanguage.
citext
Thecitextprovidesacaseinsensitivecharacterstringtype.
oid2name
oid2nameisautilityprogramthatallowsadministratorsto
examinethefilestructureofPostgresPlus.
pgbench
pgbenchisaprogramforrunningbenchmarktestson
PostgresPlus.
adminpack
adminpackprovidesanumberofsupportfunctionsused
byadministrationandmanagementtoolstoprovide
additionalfunctionality,suchasremotemanagement.
chkpass
chkpassimplementsadatatypeusedforstoringencrypted
passwords.
cube
cubeimplementsadatatypetorepresentmulti
dimensionalcubes.
dblink
dblinkisamodulethatallowsconnectionstoother
Postgresdatabasesfromwithinadatabasesession.
earthdistance
Theearthdistancemoduleprovidestwodifferent
approachestocalculatinggreatcircledistancesonthe
surfaceoftheEarth.
fuzzystrmatch
Thefuzzystrmatchmoduleprovidesseveralfunctionsto
determinesimilaritiesanddistancebetweenstrings.
hstore
hstoreimplementsadatatypeforstoringsetsof
(key,value)pairswithinasinglePostgresdatafield.
intagg
intaggprovidesanintegeraggregatorandanenumerator.
isn
isnimplementsdatatypesforinternationalproduct
numberingstandards
lo
loprovidessupportformanagingLargeObjects
ltree
ltreeimplementsadatatypeforrepresentinglabelsof
datastoredinahierarchicaltreelikestructure
pg_buffercache
pg_buffercacheallowsforexaminingthesharedbuffer
cacheinrealtime.
pg_freespacemap
pg_freespacemapallowsforexaminingthefreespacemap
pg_standby
pg_standbyallowsforthecreationofa"warmstandby"
databaseserver.
pg_trgm
pg_trgmmoduleimplementsfunctionsandoperatorsfor
determiningthesimilarityoftext.
pgcrypto
pgcryptoimplementscryptographicfunctionsforPostgres
pgrowlocks
pgrowlocksprovidesafunctiontoshowrowlocking
informationforaspecifiedtable.
27
pgstattuple
pgstattupleimplementsvariousfunctionstoobtaintuple
levelstatistics.
seg
segimplementsadatatypeforrepresentinglinesegments.
tablefunc
tablefuncprovidesvariousfunctionsthatreturntables.
tsearch2
tsearch2implementsfulltextindexing.
uuidossp
uuidosspprovidesfunctionstogenerateuniversally
uniqueidentifiers.
xml2
xml2providesXPathqueryingandXSLTfunctionality
Table3:PostgresPlusAdvancedServerAddOnModules
All the above features are integrated into a simple intuitive installer and
certified through a repeatable and extensive quality assurance process.
This ensures that all pieces encompassing many open source projects
function as a single, coherent, general, enterprise class database system.
Finally, the automatic update notification system delivered with Postgres
Plus Advanced Server ensures that all modules are kept up to date and in
synch.
28
PL Debugger
The PL Debugger gives developers and DBAs the ability to test and debug
Postgres Plus server-side programs using a graphical, dynamic interface.
The PL Debugger can be used on SPL stored procedures, functions, and
triggers as well as PL/pgSQL functions. Users can follow code paths, set
breakpoints, edit variables, and step into and over code.
Figure16:ThePLDebugger
29
appropriate breakpoint. From there the developer can step through the
trigger logic and watch how the data is treated.
This type of server side debugging can be extremely beneficial in
resolving problems where more than one application utilizes common
database code. Developers can debug trigger code in the context of each
referencing application.
Figure17:Serversidetriggerdebuggingintheapplicationcontext
30
One of the most common error scenarios is the handling of duplicate key
violations. The functional requirement in many applications is to add a new
record to a table but, if its key already exists, then the existing record
should be updated. With PostgreSQL, it is necessary to check if a record
exists before issuing an insert or update transaction, forcing this operation
to consume two statements. With the use of Transaction Error Recovery,
the application can issue an insert transaction and then only issue an
update transaction if the duplicate key is detected.
try{
stmt.executeUpdate("INSERT INTO TELLERS(TID, BID) " +
"VALUES(1000, 1)");
} catch (Exception e) {
stmt.executeUpdate("UPDATE TELLERS SET BID=1 " +
"WHERE TID=1000");
}
Scalability
GridSQL
31
Figure18:GridSQLArchitecture
Using GridSQL, database performance improves in a linear fashion as
additional commodity servers are added to the grid. To simplify
management, GridSQL includes a robust suite of graphical management
and monitoring tools that enable DBAs to monitor the entire grid from a
single console.
EnterpriseDB Corporation, 2009 All rights reserved.
EnterpriseDB and Postgres Plus are trademarks of EnterpriseDB Corporation. Other names may be trademarks of their respective owners.
http://www.enterprisedb.com
32
Figure19:GridSQLManagementConsole
Postgres Plus Advanced Server includes the Grid Monitoring Tool that
allows the DBA to monitor and tune the entire grid. In addition, it visually
depicts the traffic among and within the nodes so that the entire grid can
be balanced. The Postgres Plus Advanced Server version of GridSQL
also makes provisioning a snap, allowing nodes to be easily added to or
removed from the grid.
Infinite Cache
33
already in the buffer cache. This overwritten data may be required again at
some point where upon, the data must be read in from disk again. This
constant competition for a limited amount of memory is what can become
an upper bound on performance.
Thus, it follows that if there was an infinite amount of memory available,
performance can reach a more optimum level as there would be no need
to re-read data from disk once it already has a place in memory.
EnterpriseDBs Infinite Cache for Postgres Plus Advanced Server is built
on this concept. Available on Linux systems, Infinite Cache allows
Postgres Plus Advanced Server to utilize memory from networked,
commodity computers as a secondary memory cache (cache servers).
This provides a cache blade architecture using commodity hardware and
eliminates the need to upgrade the database server to a more expensive
platform such as Unix.
Figure20:InfiniteCacheArchitecture
With Infinite Cache in place, when a block of data is read from disk for the
first time, it is also stored on a cache server as well as in the shared buffer
cache. If at some later point in time, that same data block is requested
again, the shared buffer cache is searched first.
If the data block had been evicted from the shared buffer cache, the cache
servers are searched next instead of going to disk. Since the secondary
cache can be significantly larger than the shared buffer cache, chances
are that the data block still resides on a cache server, and the data block
can be retrieved from there, which still provides quicker retrieval than disk
34
access. Only if the data is not on any of the cache servers, is the disk read
for the data.
Note that this all happens transparent to the application no program
coding changes are required to take advantage of Infinite Cache.
To further increase the capacity of the secondary cache, Infinite Cache
provides the capability to compress the data to varying degrees before it is
placed on a cache server. Compression allows more data to fit in the
secondary cache and also reduces the amount of bandwidth required to
transmit the data over the network.
Infinite Cache also allows you to warm the cache servers that is
selected tables, with or without their indexes, can be pre-loaded into the
secondary cache before the application starts. This allows the database
server to immediately take advantage of the faster memory access over
disk reads if it is known in advance what tables the application will use.
Usage statistics can be obtained from each cache server to analyze
performance and efficiency.
Value
1051681421
1410538244
42544414583
5139685
126588
104
4
126588
0
5139530
155
1073741824
3047
32
109.077417
21.423743
1
1242367107
115
126588
1095
1.2.6
The following graph illustrates the dramatic performance increase that can
be realized on Postgres Plus Advanced Server with Infinite Cache enabled
as compared with running the same test with Infinite Cache turned off.
EnterpriseDB Corporation, 2009 All rights reserved.
EnterpriseDB and Postgres Plus are trademarks of EnterpriseDB Corporation. Other names may be trademarks of their respective owners.
http://www.enterprisedb.com
35
Figure21:InfiniteCachevs.NoInfiniteCache
Since Infinite Cache keeps as much previously read data in the secondary
cache as possible, the greatest performance advantage is gained for
applications that repeatedly access the same data. The following graph
compares two applications one which includes some updates (the top
pair of bars) and the other which is read-only (the bottom pair of bars). As
seen by observing the top bar of the bottom pair, Infinite Cache has its
most dramatic benefit for the application performing 100% SELECT
statements.
36
Figure22:Selectsvs.UpdatesWithandWithoutInfiniteCache
The final graph shows the performance increase gained for both update
and read-only applications when using Postgres Plus Advanced Server
with Infinite Cache over PostgreSQL. As can be expected, the most
dramatic performance gain is in the read-only application.
37
Figure23:PostgreSQLvs.PostgresPlusAdvancedServerWithInfiniteCache
Database Links
38
Database links also form the basis for a homogenous distributed database
system. An application can simultaneously access data in several
databases in a single distributed environment. For example, a client can
issue a single query on a local database and can retrieve joined data from
tables on remote databases. For client applications, the location and
platform of the databases are transparent.
Figure24:DatabaseLinking
39
Conclusion
AboutEnterpriseDB