Multi-Dimensional
Arrays in PL/SQL
Steven Feuerstein and John Beresniewicz
In this first of two articles appearing over the next several issues, Steven
Feuerstein and John Beresniewicz demonstrate the basic techniques of applying
multi-dimensional collections in your code. Next time, they’ll take a look at a
more complex scenario and bring the full power of Oracle9i collections to bear
on their solution.
March 2004
Volume 11, Number 3
PL/SQL developers are a generally whiny lot. We complain a lot, but then,
Function dimZplane
The Z dimension was the last in our stepwise declaration of collection types (the third dimension). The dimZ_t type is our 3-D collection, and for any given index value in the Z dimension the element for that value will be of type dimY_t. That is, the element at any given value for dimension Z is precisely the 2-D plane of interest, defined as a collection of the type returned by the function.

FUNCTION dimZplane (
   array_in     IN   dimZ_t
 , dimZval_in        PLS_INTEGER   -- fixed value of dimZ
)
   RETURN dimY_t
IS
   dimY_tbl   dimY_t;
BEGIN
   -- Return the whole 2-D plane stored at this Z index, if there is one.
   IF array_in.EXISTS (dimZval_in)
   THEN
      dimY_tbl := array_in (dimZval_in);
   END IF;

   RETURN dimY_tbl;   -- empty collection when the Z index does not exist
END dimZplane;

So in this case, our function has a simple job to do: Just find the proper element and return it. No muss, no fuss, precisely because this particular slicing corresponds directly to the way that the multi-dimensional collection is constructed. If only it were always this simple.

Function dimYplane: all dimensions are not equal
1. Loop over the Z dimension from beginning to end.
2. If an entry exists for the given Y value at this Z value, add the one-dimensional array of values to the returned 2-D array at that Z index.

FUNCTION dimYplane (
   array_in     IN   dimZ_t
 , dimYval_in        PLS_INTEGER   -- fixed value of dimY
)
   RETURN dimY_t
IS
   dimY_tbl   dimY_t;
   indx       PLS_INTEGER;   -- current Z index
BEGIN
   indx := array_in.FIRST;

   WHILE indx IS NOT NULL
   LOOP
      -- The EXISTS guard avoids NO_DATA_FOUND when this Z plane
      -- has no entry at all for the requested Y value.
      IF     array_in (indx).EXISTS (dimYval_in)
         AND array_in (indx) (dimYval_in).COUNT > 0
      THEN
         dimY_tbl (indx) := array_in (indx) (dimYval_in);
      END IF;

      indx := array_in.NEXT (indx);
   END LOOP;

   RETURN dimY_tbl;
END dimYplane;

Studying this code gives an appreciation for the challenges of keeping indexing straight in multi-dimensional collections, and for the fact that all dimensions are not equal even in what seems on the surface to be a highly symmetrical requirement.

Function dimXplane
Having learned an important complexity lesson in dimYplane, it shouldn’t be a surprise to find that our third function, which slices out a YZ plane given an X value, is yet more complex. In this case we do the following:
1. Loop over both the Z and Y dimensions from beginning to end.
2. If an entry exists in the collection for the X value specified and the current loop indexes for Z and Y, add the value of this entry to a 2-D array using the Z and Y indexes.

FUNCTION dimXplane (
   array_in     IN   dimZ_t
 , dimXval_in        PLS_INTEGER   -- fixed value of dimX
)
   RETURN dimY_t
IS
   dimY_tbl   dimY_t;
   indx1      PLS_INTEGER;   -- current Z index
   indx2      PLS_INTEGER;   -- current Y index
BEGIN
   indx1 := array_in.FIRST;

   -- ASSUMPTION: the middle of this listing is missing from the text.
   -- The loop structure down to the NEXT call on indx2 is a
   -- reconstruction from steps 1 and 2 above, not the printed code.
   WHILE indx1 IS NOT NULL
   LOOP
      indx2 := array_in (indx1).FIRST;

      WHILE indx2 IS NOT NULL
      LOOP
         IF array_in (indx1) (indx2).EXISTS (dimXval_in)
         THEN
            dimY_tbl (indx1) (indx2) :=
                           array_in (indx1) (indx2) (dimXval_in);
         END IF;

         indx2 := array_in (indx1).NEXT (indx2);
      END LOOP;

      indx1 := array_in.NEXT (indx1);
   END LOOP;

   RETURN dimY_tbl;
END dimXplane;

Once again, the code reveals a certain structural symmetry, but is far from trivial (as compared to the first slicing program for the Z dimension).

Conclusion
Multi-dimensional PL/SQL collections are extremely powerful data structures, offering tremendous flexibility. There is, however, a clear trade-off: Manipulation of elements within the various dimensions can be very complex. Generally, the level of complexity corresponds
Continues on page 16
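The slicing functions in this article depend on a chain of nested collection types whose declarations are not part of this excerpt. The block below is a minimal sketch of how such a chain might be declared and how dimZplane could be called. The VARCHAR2(30) element type, the variable names l_array and l_plane, and the sample values are assumptions made for illustration only.

```sql
SET SERVEROUTPUT ON SIZE 1000000
DECLARE
   -- Assumed type chain: X values inside Y, Y inside Z.
   TYPE dimX_t IS TABLE OF VARCHAR2 (30) INDEX BY PLS_INTEGER;
   TYPE dimY_t IS TABLE OF dimX_t INDEX BY PLS_INTEGER;
   TYPE dimZ_t IS TABLE OF dimY_t INDEX BY PLS_INTEGER;

   l_array   dimZ_t;   -- hypothetical 3-D collection
   l_plane   dimY_t;   -- receives one 2-D slice

   FUNCTION dimZplane (
      array_in     IN   dimZ_t
    , dimZval_in        PLS_INTEGER
   )
      RETURN dimY_t
   IS
      dimY_tbl   dimY_t;
   BEGIN
      IF array_in.EXISTS (dimZval_in)
      THEN
         dimY_tbl := array_in (dimZval_in);
      END IF;

      RETURN dimY_tbl;
   END dimZplane;
BEGIN
   -- Subscripts are always given in the order (Z) (Y) (X).
   l_array (1) (1) (1) := 'z1-y1-x1';
   l_array (1) (2) (5) := 'z1-y2-x5';
   l_array (2) (1) (1) := 'z2-y1-x1';

   -- Slice out the plane at Z = 1 and read one cell from it.
   l_plane := dimZplane (l_array, 1);
   DBMS_OUTPUT.put_line ('Y entries at Z=1: ' || l_plane.COUNT);
   DBMS_OUTPUT.put_line ('Cell (Y=2, X=5): ' || l_plane (2) (5));

   -- A Z value that was never populated yields an empty plane.
   l_plane := dimZplane (l_array, 99);
   DBMS_OUTPUT.put_line ('Y entries at Z=99: ' || l_plane.COUNT);
END;
/
```

Because Z is the outermost subscript, the Z slice is a single element lookup. The Y and X slices have to walk the outer dimensions first.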
In doing my two-day workshops over the past year, I’ve discovered that most companies continue to use a mix of both the Cost-Based Optimizer (CBO) and the Rule-Based Optimizer (RBO), pretty much separated by applications. I’ll start this article with a quick review of the RBO, and then I’ll introduce many of the theories I’ve encountered over the years and conclude with a quick comparison of the RBO vs. the CBO.

RBO: Review
The Rule-Based Optimizer makes its decisions based on the text of the SQL statement itself, the presence or absence of indexes, the order of the tables in the FROM clause, and data dictionary information.
The RBO uses a set of 19 rules to make its decisions; they’re displayed in Table 1. The fastest way to access any single row in an Oracle database is to supply a valid ROWID. This, of course, ranks the highest on the rule list. Full table scans rank the lowest, although a full table scan might be the best access method.
The RBO really has only a few decisions to make with each SQL statement. I’ve found that only the data dictionary tends to have “clustered” objects. Most objects don’t have such a variety of different types of indexes either. The RBO
only one table in the FROM clause, the RBO’s processing is a bit simpler. Let’s review this scenario first.
The RBO simply looks for the existence of indexes on columns in the WHERE clause and picks the predicate to start processing based on the rules in Table 1. The lowest rank number wins (rank 1 is the most preferred access path). If two columns have indexes that come out to the same rank, the one with the newest creation date wins.
If the SQL has two or more tables being joined, the RBO will then make several passes through the WHERE clause predicates to find the relationships and determine what, if any, indexes exist on those relationships. It starts

Table 1. RBO rules.
Rank  WHERE clause rule
1     ROWID = constant
2     unique indexed column = constant
3     entire unique concatenated index = constant
4     entire cluster key = cluster key of object in same cluster
5     entire cluster key = constant
6     entire nonunique concatenated index = constant
7     nonunique index = constant
8     entire noncompressed concatenated index >= constant
9     entire compressed concatenated index >= constant
10    partial but leading columns of noncompressed concatenated index
11    partial but leading columns of compressed concatenated index
12    unique indexed column using the SQL statement BETWEEN or LIKE options
13    nonunique indexed column using the SQL statement BETWEEN or LIKE options
14    unique indexed column < or > constant
15    nonunique indexed column < or > constant
16    sort/merge
17    MAX or MIN SQL statement functions on indexed column
18    ORDER BY entire index
19    full table scans
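One way to watch the ranking in Table 1 at work on a single-table query is to force rule-based optimization and look at the plan. This is only a sketch: the EMP table, a unique index on EMPNO, a nonunique index on DEPTNO, and an existing PLAN_TABLE are all assumptions, and DBMS_XPLAN requires Oracle9i Release 2 or later.

```sql
-- Ask this session to use the Rule-Based Optimizer.
ALTER SESSION SET optimizer_mode = RULE;

-- Two indexed predicates on one table (hypothetical table and indexes):
--   empno  = constant  -> unique indexed column    (rank 2 in Table 1)
--   deptno = constant  -> nonunique indexed column (rank 7 in Table 1)
EXPLAIN PLAN FOR
SELECT ename
  FROM emp
 WHERE empno = 7839
   AND deptno = 10;

-- Show the access path that was chosen.
SELECT * FROM TABLE (DBMS_XPLAN.display);
```

If the ranking works as described, the plan should reach EMP through the unique index, since rank 2 beats rank 7.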
Example databases and some of the scripts used in this article come with permission from
Tim Gorman (www.Sagelogix.com) and Jonathan Lewis (www.jlcomp.demon.co.uk).
SELECT count(*)
from A, B, C
WHERE A.STATUS = B.STATUS
AND A.B_ID = B.ID
AND B.STATUS = 'OPEN'
AND B.ID = C.B_ID
AND C.STATUS = 'OPEN';
Theory 2: FROM clause table order matters
Does the RBO make its driving table decisions from the order of the FROM clause? Let’s look at the A, B, and C tables again. Notice in Figure 2 that the tables in the FROM clause are in the order A, B, C. Let’s change the order and see what happens to our explain plan. In the following query, we’ll move the A table from the beginning of the FROM clause to the end:

SELECT count(*)
from B, C, A
WHERE A.STATUS = B.STATUS
AND B.STATUS = 'OPEN'
AND B.ID = C.B_ID
AND C.STATUS = 'OPEN'
AND A.B_ID = B.ID;

The results are shown in Figure 3. Notice that when the RBO starts with the A table, there’s no condition in the WHERE clause that compares A and C directly. The RBO then moves to the B table and finds two different predicates, A.B_ID = B.ID and A.STATUS = B.STATUS. It picks the STATUS column, not because of its position in the WHERE clause (as some believe) but because it tied on rule 7 and the index creation date broke the tie in favor of the B table (see the output from the script Index_Info.sql in Figure 4). When statistics have been run, more information appears in this script (a topic for a future article!). Since we’re using the RBO, we can’t have statistics collected for the purposes of these examples.
Yes, the order of tables in the FROM clause has a distinct and predictable effect on the RBO.
Let’s look at just two tables in the FROM clause. Using the traditional EMP and DEPT, with only a primary

Theory 3: Index creation date matters
Following close behind the example in Figure 3, the RBO seemed to pick B over A as the driving table no matter which side of the “=” condition the A and B predicates appeared on. It seemed to pick the B table because of the creation date of the indexes on the STATUS columns. In this example, however, the RBO always picked the B table to drive off of in a nested loop, even with the indexes being created at different times and with the B.STATUS index dropped!
So what’s the answer? The RBO doesn’t seem to be making decisions based solely on the existence of indexes. It seems to pick the nested loop join mechanism when there’s at least one index involved, while picking the merge join mechanism when no indexes are involved.
In the A, B, C table example, it always picked the B table to drive off of until there were no indexes on either the B table or the C table. The RBO seems to favor driving

Figure 3. Different FROM clause with explain plan.
Figure 5. Two-table FROM clause.
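Because the index creation date theory depends on knowing when each index was built, it can help to list those dates next to the index definitions. The Index_Info.sql script mentioned in the text is not reproduced here. The query below is only a rough stand-in built on the standard USER_INDEXES and USER_OBJECTS dictionary views, and it assumes the example tables are literally named A, B, and C in the current schema.

```sql
-- List each index on the example tables with its creation timestamp,
-- oldest first, so tie-breaking by creation date can be checked by eye.
SELECT i.table_name
     , i.index_name
     , i.uniqueness
     , TO_CHAR (o.created, 'YYYY-MM-DD HH24:MI:SS') AS created
  FROM user_indexes i
     , user_objects o
 WHERE o.object_name = i.index_name
   AND o.object_type = 'INDEX'
   AND i.table_name IN ('A', 'B', 'C')   -- assumed table names
 ORDER BY o.created;
```

Dropping and re-creating one index between test runs changes its CREATED value, which is one way to probe whether creation order really affects the driving table.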
select sum(sales_tot)
from multi_key_tbl1 a, multi_key_tbl2 b
where a.key1 = b.key1
and a.key2 = b.key2
and a.key3 = b.key3;
select sum(sales_tot)
from multi_key_tbl1 a, multi_key_tbl2 b
where a.key2 = b.key2
and a.key3 = b.key3
and a.key1 = b.key1;
Let’s begin by declaring the VARRAY as shown in the following code. Oracle’s VARRAY collection type provides PL/SQL with a traditional array: non-sparse and strongly typed.

CREATE OR REPLACE
TYPE StringList
AS VARYING ARRAY (10000) OF VARCHAR2(2000)
/

The array limitation of 10000 is arbitrary, and serves only as a constraint: any attempt to extend beyond that limit will raise the Oracle exception “ORA-06532: Subscript outside of limit.” The declaration itself doesn’t provide any initialization; a StringList variable still has to be initialized with its constructor before it can be used.
The StringList type is useful by itself. Just consider any procedure that takes an unknown number of parameters. Instead of having a long list of optional parameters like the procedure in Listing 1, you can simply declare a StringList parameter as in Listing 2, which is convenient both within the procedure and when calling it.

Listing 1. The traditional manner of handling a varying number of parameters.

on the fly, within the parameter list of the calling function. This is one instance where conciseness doesn’t sacrifice readability.

Declaring the StringArray type specification
The object type specification in Listing 3 is reasonably rich in member procedures and functions. It does not, and need not, contain every process that you or any of your fellow developers will ever want to perform upon an array of strings. With Oracle9i Release 2, object types with dependents can now be altered. This isn’t as convenient as CREATE OR REPLACE, which is available only for object types without dependents (think of the difficulties of adding comments using the ALTER syntax), and there are bugs in the ALTER command that may not be fixed until version 10g. So it’s still very necessary to carefully plan your object type design, and to have a drop and re-create strategy in case a hierarchy of types has to be rebuilt.
A second way of adding functionality is by inheritance, which I think is preferable once an object type has passed a preliminary shake-down phase. A brief discussion of inheritance appears at the end of this article.
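To make the StringList parameter idea concrete, here is a minimal sketch of a procedure that accepts any number of strings through a single argument. It assumes the schema-level StringList type declared earlier in this article has already been created. The procedure name show_all and the sample values are hypothetical and are not taken from the article’s listings.

```sql
SET SERVEROUTPUT ON SIZE 1000000
DECLARE
   -- Hypothetical procedure: one collection parameter replaces a long
   -- list of optional VARCHAR2 parameters.
   PROCEDURE show_all (items_in IN StringList)
   IS
   BEGIN
      -- A VARRAY is dense, so a simple 1 .. COUNT loop visits every element.
      FOR i IN 1 .. items_in.COUNT
      LOOP
         DBMS_OUTPUT.put_line (i || ': ' || items_in (i));
      END LOOP;
   END show_all;
BEGIN
   -- The collection is built in place with the type's constructor.
   show_all (StringList ('alpha', 'beta', 'gamma'));

   -- An empty list is also a valid argument; the loop body never runs.
   show_all (StringList ());
END;
/
```

The caller decides how many values to pass, and the procedure signature never has to change when that number grows.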
What fun can we have with PL/SQL? One of my friends posed me this puzzle: Suppose it’s circa 1950 and you have $100 to spend at a livestock fair. You’re in the market to buy chickens, sheep, and pigs. The number of animals that you have to buy is 100. Each chicken costs 10 cents, a sheep costs $2, and a pig costs $5. How many of each should you buy?

Solution: Mathematically, two linear equations can be developed from this information. Let the number of chickens be X, the number of sheep be Y, and the number of pigs be Z. So we get two equations: X + Y + Z = 100 and 0.1X + 2Y + 5Z = 100 (or X + 20Y + 50Z = 1,000). Two linear equations in three variables don’t determine a unique answer on their own. However, an exact solution does exist. The other information we have is that X, Y, and Z are all positive integers. We can try different values for X, Y, and Z and see whether they satisfy both of the equations. However, it’s a Sisyphean task to find those values by trial and error. Instead, we can use PL/SQL loops that do the job in a few seconds.
We can check all possible combinations by running the values in loops and breaking the loop whenever the two criteria are met, as I do here:

set serveroutput on size 1000000
begin
   -- z = number of pigs: check every value in 1..100
   for z in 1..100
   loop
      -- y = number of sheep: for each value of pig check every value in 1..100
      for y in 1..100
      loop
         -- x = number of chickens: for each pig/sheep pair check every value in 1..100
         for x in 1..100
         loop
            -- If both equations are satisfied, report and leave the innermost loop
            if ((x + 20*y + 50*z = 1000) and (x + y + z = 100)) then
               dbms_output.put_line('The number of chickens is '||x);
               dbms_output.put_line('The number of sheep is '||y);
               dbms_output.put_line('The number of pigs is '||z);
               exit;
            end if;
         end loop;
      end loop;
   end loop;
end;
/

And the answers are: 70 chickens, 19 sheep, 11 pigs.

Anunaya Shrivastava, PMP, OCP Financials, OCP Internet Developer, OTC, has been working with Oracle technology for more than seven years. He works for HCL Enterprise Solutions. anunaya@hotmail.com.
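As a side note on the livestock puzzle, the search space can be narrowed with a little algebra before any looping. Subtracting X + Y + Z = 100 from X + 20Y + 50Z = 1,000 eliminates X and leaves 19Y + 49Z = 900. Since 49Z must stay below 900, Z can be at most 18, so a single loop over Z is enough. The block below is a sketch of that variation, not the author’s method.

```sql
SET SERVEROUTPUT ON SIZE 1000000
DECLARE
   x   PLS_INTEGER;   -- number of chickens
   y   PLS_INTEGER;   -- number of sheep
BEGIN
   -- z = number of pigs; 49 * z < 900 limits z to 1 .. 18.
   FOR z IN 1 .. 18
   LOOP
      -- 19 * y = 900 - 49 * z has an integer solution only when
      -- the right-hand side is divisible by 19.
      IF MOD (900 - 49 * z, 19) = 0
      THEN
         y := (900 - 49 * z) / 19;
         x := 100 - y - z;

         -- Every animal count has to be a positive integer.
         IF x > 0 AND y > 0
         THEN
            DBMS_OUTPUT.put_line (
               x || ' chickens, ' || y || ' sheep, ' || z || ' pigs');
         END IF;
      END IF;
   END LOOP;
END;
/
```

This checks 18 candidates instead of a million and reports every combination that satisfies both equations, so it also shows whether the answer is unique.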
Pinnacle, A Division of Lawrence Ragan Communications, Inc. ▲ 800-493-4867 x.4209 or 312-960-4100 ▲ Fax 312-960-4106
to the dimensional level at which the manipulation takes place. Higher-level dimensional access produces simpler code structures, as complex structures are manipulated as a unit. Directly addressing and manipulating elements lower down in the dimension chain results in more complex code. It can be quite a challenge to keep straight all of the different dimensional indexes, and the order in which they must be specified.

Steven Feuerstein is considered one of the world’s leading experts on the Oracle PL/SQL language, having written nine books on PL/SQL, including Oracle PL/SQL Programming and Oracle PL/SQL Best Practices (all from O’Reilly & Associates). Steven has been developing software since 1980 and serves as a senior technology advisor to Quest Software. His current projects include Swyg (www.Swyg.com) and the Refuser Solidarity Network (www.refusersolidarity.net), which supports the Israeli military refuser movement. steven@stevenfeuerstein.com.
For access to current and archive content and source code, log in at www.pinnaclepublishing.com.
Phone: 800-493-4867 x.4209 or 312-960-4100
Fax: 312-960-4106
Email: PinPub@Ragan.com
Advertising: RogerS@Ragan.com
Editorial: FarionG@Ragan.com
Pinnacle Web Site: www.pinnaclepublishing.com

Subscription rates
United States: One year (12 issues): $199; two years (24 issues): $348
Other:* One year: $229; two years: $408
Single issue rate: $27.50 ($32.50 outside United States)*
* Funds must be in U.S. currency.

Copyright © 2004 by Lawrence Ragan Communications, Inc. All rights reserved. No part of this periodical may be used or reproduced in any fashion whatsoever (except in the case of brief quotations embodied in critical articles and reviews) without the prior written consent of Lawrence Ragan Communications, Inc. Printed in the United States of America.

Oracle, Oracle 8i, Oracle 9i, PL/SQL, and SQL*Plus are trademarks or registered trademarks of Oracle Corporation. Other brand and product names are trademarks or registered trademarks of their respective holders. Oracle Professional is an independent publication not affiliated with Oracle Corporation. Oracle Corporation is not responsible in any way for the editorial policy or other contents of the publication.

This publication is intended as a general guide. It covers a highly technical and complex subject and should not be used for making decisions concerning specific products or applications. This publication is sold as is, without warranty of any kind, either express or implied, respecting the contents of this publication, including but not limited to implied warranties for the publication, performance, quality, merchantability, or fitness for any particular purpose. Lawrence Ragan Communications, Inc., shall not be liable to the purchaser or any other person or entity with respect to any liability, loss, or damage caused or alleged to be caused directly or indirectly by this publication. Articles published in Oracle Professional reflect the views of their authors; they may or may not reflect the view of Lawrence Ragan Communications, Inc. Inclusion of advertising inserts does not constitute an endorsement by Lawrence Ragan Communications, Inc., or Oracle Professional.