You are on page 1of 11

10/20/12

Queries

Queries
part of SQL for Web Nerds by Philip Greenspun

If you start up SQL*Plus, you can start browsing around immediately with the SELECT statement. You don't even need to define a table; Oracle provides the built-in d a table for times when you're interested in a constant or a function: ul
SL slc 'el Wrd fo da; Q> eet Hlo ol' rm ul 'ELWRD HLOOL ---------HloWrd el ol SL slc 22fo da; Q> eet + rm ul 22 + --------4 SL slc ssaefo da; Q> eet ydt rm ul SSAE YDT --------19-21 990-4

... or to test your knowledge of three-valued logic (see the "Data Modeling" chapter):
SL slc 4NL fo da; Q> eet +UL rm ul 4NL +UL
philip.greenspun.com/sql/queries.html 1/11

10/20/12

Queries

---------

(any expression involving NULL evaluates to NULL). There is nothing magic about the d a table for these purposes; you can compute functions using the b o r table instead of d a : ul bad ul
slc ssae22aa20 -)fo bor; eet ydt,+,tn(, 1 rm bad SSAE YDT 22AA20-) + TN(,1 ----- ----- ---------- ----- ----19-11 990-4 4 31196 .4525 19-11 990-4 4 31196 .4525 19-11 990-4 4 31196 .4525 19-11 990-4 4 31196 .4525 .. . 19-11 990-4 19-11 990-4 19-11 990-4 4 31196 .4525 4 31196 .4525 4 31196 .4525

500rw slce. 51 os eetd

but not everyone wants 55010 copies of the same result. The d a table is predefined during Oracle installation and, though it is just a plain old ul table, it is guaranteed to contain only one row because no user will have sufficient privileges to insert or delete rows from d a . ul

Getting beyond Hello World


To get beyond Hello World, pick a table of interest. As we saw in the introduction,
slc *fo ues eet rm sr;

would retrieve all the information from every row of the u e stable. That's good for toy systems but in any production system, you'd be better sr off starting with
SL slc cut* fo ues Q> eet on() rm sr; CUT* ON() --------philip.greenspun.com/sql/queries.html 2/11

Queries

75 32

You don't really want to look at 7352 rows of data, but you would like to see what's in the users table, start off by asking SQL*Plus to query Oracle's data dictionary and figure out what columns are available in the u e stable: sr
SL dsrb ues Q> ecie sr Nm ae Nl? ul Tp ye -------------------- ---------------- ---- -UE_D SRI NTNL NME(8 O UL UBR3) FRTNMS IS_AE NTNL VRHR(0) O UL ACA210 LS_AE ATNM NTNL VRHR(0) O UL ACA210 PI_AE RVNM NME(8 UBR3) EAL MI NTNL VRHR(0) O UL ACA210 PI_MI RVEAL NME(8 UBR3) EALBUCN_ MI_ONIGP CA() HR1 PSWR ASOD NTNL VRHR(0 O UL ACA23) UL R VRHR(0) ACA220 O_AAINUTL NVCTO_NI DT AE LS_II ATVST DT AE SCN_OLS_II EODT_ATVST DT AE RGSRTO_AE EITAINDT DT AE RGSRTO_P EITAINI VRHR(0 ACA25) AMNSRTRP DIITAO_ CA() HR1 DLTDP EEE_ CA() HR1 BNE_ ANDP CA() HR1 BNIGUE ANN_SR NME(8 UBR3) BNIGNT ANN_OE VRHR(00 ACA240)

The data dictionary is simply a set of built-in tables that Oracle uses to store information about the objects (tables, triggers, etc.) that have been defined. Thus SQL*Plus isn't performing any black magic when you type d s r b ; it is simply querying u e _ a _ o u n , a view of some ecie srtbclms of the tables in Oracle's data dictionary. You could do the same explicitly, but it is a little cumbersome.
clm fnytp fra a0 oun ac_ye omt 2 slc clm_ae dt_ye| ''| dt_egh| ''a fnytp eet ounnm, aatp | ( | aalnt | ) s ac_ye fo ue_a_oun rm srtbclms weetbenm ='SR' hr al_ae UES odrb clm_d re y ouni;

Here we've had to make sure to put the table name ("USERS") in all-uppercase. Oracle is case-insensitive for table and column names in
3/11

10/20/12

Queries

queries but the data dictionary records names in uppercase. Now that we know the names of the columns in the table, it will be easy to explore.

Simple Queries from One Table


A simple query from one table has the following structure: the select list (which columns in our report) the name of the table the where clauses (which rows we want to see) the order by clauses (how we want the rows arranged) Let's see some examples. First, let's see how many users from MIT are registered on our site:
SL slc eal Q> eet mi fo ues rm sr weeeallk 'mteu; hr mi ie %i.d' EAL MI ----------------------------pigmteu hl@i.d ad@aionamteu nyclfri.i.d bnmteu e@i.d .. . wlmnlsmteu ola@c.i.d gos@i.d hmymteu hlmteu a@i.d .. . jeremteu pac@i.d rcmn@lmmteu ihodau.i.d ad_o@i.d nyromteu kvmteu o@i.d fec@i.d lthmteu ladnmteu sno@i.d pzmteu s@i.d piga.i.d hl@imteu pigmrin.imteu hl@atgya.i.d ad@aioni.i.d nyclfrnamteu t@i.d ymteu taasmteu edm@i.d
philip.greenspun.com/sql/queries.html 4/11

10/20/12

Queries

6 rw slce. 8 os eetd

The e a l l k ' m t e u says "every row where the email column ends in 'mit.edu'". The percent sign is Oracle's wildcard character for mi ie %i.d' "zero or more characters". Underscore is the wildcard for "exactly one character":
SL slc eal Q> eet mi fo ues rm sr weeeallk '_@i.d' hr mi ie __mteu; EAL MI ----------------------------kvmteu o@i.d hlmteu a@i.d .. . bnmteu e@i.d pzmteu s@i.d

Suppose that we notice in the above report some similar email addresses. It is perhaps time to try out the ORDER BY clause:
SL slc eal Q> eet mi fo ues rm sr weeeallk 'mteu hr mi ie %i.d' odrb eal re y mi; EAL MI ----------------------------ad@aionamteu nyclfri.i.d ad@aioni.i.d nyclfrnamteu ad_o@i.d nyromteu .. . bnmteu e@i.d .. . hlmteu a@i.d .. . piga.i.d hl@imteu pigmrin.imteu hl@atgya.i.d pigmteu hl@i.d
philip.greenspun.com/sql/queries.html 5/11

10/20/12

Queries

Now we can see that this users table was generated by grinding over pre-ArsDigita Community Systems postings starting from 1995. In those bad old days, users typed their email address and name with each posting. Due to typos and people intentionally choosing to use different addresses at various times, we can see that we'll have to build some sort of application to help human beings merge some of the rows in the users table (e.g., all three occurrences of "philg" are in fact the same person (me)).

Restricting results
Suppose that you were featured on Yahoo in September 1998 and want to see how many users signed up during that month:
SL slc cut* Q> eet on() fo ues rm sr weergsrto_ae> '980-1 hr eitaindt = 19-90' adrgsrto_ae<'981-1; n eitaindt 19-00' CUT* ON() --------90 2

We've combined two restrictions in the WHERE clause with an AND. We can add another restriction with another AND:
SL slc cut* Q> eet on() fo ues rm sr weergsrto_ae> '980-1 hr eitaindt = 19-90' adrgsrto_ae<'981-1 n eitaindt 19-00' adeallk 'mteu; n mi ie %i.d' CUT* ON() --------3 5

OR and NOT are also available within the WHERE clause. For example, the following query will tell us how many classified ads we have that either have no expiration date or whose expiration date is later than the current date/time.
slc cut* eet on() fo casfe_d rm lsiidas weeeprs> ssae hr xie = ydt o eprsi nl; r xie s ul
philip.greenspun.com/sql/queries.html 6/11

Subqueries
You can query one table, restricting the rows returned based on information from another table. For example, to find users who have posted at least one classified ad:
slc ue_d eal eet sri, mi fo ues rm sr wee0<(eetcut* hr slc on() fo casfe_d rm lsiidas weecasfe_d.sri =uesue_d; hr lsiidasue_d sr.sri) UE_DEAL SRI MI ----- ---------------------- ----------------445tmmto.o 28 w@eercm 449tuga@ctcuhc.d 28 rnhues.scioeu 439rcrocraa@b.s.d 28 iad.avjlksmueu 433gnft@t.e 29 o2oogent 439rbhwi.rcm 29 o@aair.o 443sea9i.ecmcm 25 tfn@xnto.o 436slemnpnnt 24 ivra@o.e 413gle@elyneu 25 alnwsea.d .. .

Conceptually, for each row in the u e stable Oracle is running the subquery against c a s f e _ d to see how many ads are associated sr lsiidas with that particular user ID. Keep in mind that this is only conceptually; the Oracle SQL parser may elect to execute this query in a more efficient manner. Another way to describe the same result set is using EXISTS:
slc ue_d eal eet sri, mi fo ues rm sr weeeit (eet1 hr xss slc fo casfe_d rm lsiidas weecasfe_d.sri =uesue_d; hr lsiidasue_d sr.sri)

This may be more efficient for Oracle to execute since it hasn't been instructed to actually count the number of classified ads for each user, but only to check and see if any are present. Think of EXISTS as a Boolean function that 1. takes a SQL query as its only parameter

10/20/12

Queries

2. returns TRUE if the query returns any rows at all, regardless of the contents of those rows (this is why we can use the constant 1 as the select list for the subquery)

JOIN
A professional SQL programmer would be unlikely to query for users who'd posted classified ads in the preceding manner. The SQL programmer knows that, inevitably, the publisher will want information from the classified ad table along with the information from the users table. For example, we might want to see the users and, for each user, the sequence of ad postings:
slc uesue_d ueseal casfe_d.otd eet sr.sri, sr.mi, lsiidaspse fo ues casfe_d rm sr, lsiidas weeuesue_d=casfe_d.sri hr sr.sri lsiidasue_d odrb ueseal pse; re y sr.mi, otd UE_DEAL SRI MI PSE OTD ----- --------------------------- ----------------- ----346124.20cmuev.o 90 01010@opsrecm 19-93 980-0 346124.20cmuev.o 90 01010@opsrecm 19-00 981-8 346124.20cmuev.o 90 01010@opsrecm 19-00 981-8 382124.61cmuev.o 94 01425@opsrecm 19-70 980-2 382124.61cmuev.o 94 01425@opsrecm 19-70 980-6 382124.61cmuev.o 94 01425@opsrecm 19-21 981-3 .. . 424yeiepr.o 18 m@ntotcm 19-12 980-5 424yeiepr.o 18 m@ntotcm 19-21 980-8 424yeiepr.o 18 m@ntotcm 19-30 980-8 339zuao@s.e 58 hpnvuant 19-21 981-0 339zuao@s.e 58 hpnvuant 19-21 981-0 339zuao@s.e 58 hpnvuant 19-21 981-0

Because of the JOIN restriction, w e e u e s u e _ d = c a s f e _ d . s r i , we only see those users who have posted at least hr sr.sri lsiidasue_d one classified ad, i.e., for whom a matching row may be found in the c a s f e _ d table. This has the same effect as the subquery above. lsiidas The o d r b u e s e a l p s e is key to making sure that the rows are lumped together by user and then printed in order of re y sr.mi, otd ascending posting time.

OUTER JOIN
Suppose that we want an alphabetical list of all of our users, with classified ad posting dates for those users who have posted classifieds. We
philip.greenspun.com/sql/queries.html 8/11

10/20/12

Queries

can't do a simple JOIN because that will exclude users who haven't posted any ads. What we need is an OUTER JOIN, where Oracle will "stick in NULLs" if it can't find a corresponding row in the c a s f e _ d table. lsiidas
slc uesue_d ueseal casfe_d.otd eet sr.sri, sr.mi, lsiidaspse fo ues casfe_d rm sr, lsiidas weeuesue_d=casfe_d.sri() hr sr.sri lsiidasue_d+ odrb ueseal pse; re y sr.mi, otd .. . UE_DEAL SRI MI PSE OTD ----- --------------------------- ----------------- ----570drgrmnsrn.o 29 bae@idpigcm 341dru@ctitlcm 76 bansd.ne.o 571drne@ls.e 29 benrfahnt 417drn@replo.l 77 bozfe.obxp 326drueetrnt 79 bos@ne.e 418drw@yehgwynt 77 boncbriha.e 395drw@ndncm 68 bonuie.o 19-30 980-5 395drw@ndncm 68 bonuie.o 19-31 980-0 323ds1@mz.e 48 b17aaent 572dskrk@ao.o 29 biosiyhocm .. .

The plus sign after c a s f e _ d . s r i is our instruction to Oracle to "add NULL rows if you can't meet this JOIN constraint". lsiidasue_d

Extending a simple query into a JOIN


Suppose that you have a query from one table returning almost everything that you need, except for one column that's in another table. Here's a way to develop the JOIN without risking breaking your application: 1. 2. 3. 4. 5. add the new table to your FROM clause add a WHERE constraint to prevent Oracle from building a Cartesian product hunt for ambiguous column names in the SELECT list and other portions of the query; prefix these with table names if necessary test that you've not broken anything in your zeal to add additional info add a new column to the SELECT list

Here's an example from Problem Set 2 of a course that we give at MIT (see http://www.photo.net/teaching/psets/ps2/ps2.adp). Students build a conference room reservation system. They generally define two tables: r o sand r s r a i n . The top level page is supposed to show a om eevtos user what reservations he or she is current holding:
9/11

10/20/12

Queries

user what reservations he or she is current holding:


slc ro_d sattm,edtm eet omi, tr_ie n_ie fo rsrain rm eevtos weeue_d=3 hr sri 7

This produces an unacceptable page because the rooms are referred to by an ID number rather than by name. The name information is in the r o stable, so we'll have to turn this into a JOIN. om Step 1: add the new table to the FROM clause

slc ro_d sattm,edtm eet omi, tr_ie n_ie fo rsrain,ros rm eevtos om weeue_d=3 hr sri 7

We're in a world of hurt because Oracle is now going to join every row in r o swith every row in r s r a i n where the u e _ d om eevtos sri matches that of the logged-in user. Step 2: add a constraint to the WHERE clause
slc ro_d sattm,edtm eet omi, tr_ie n_ie fo rsrain,ros rm eevtos om weeue_d=3 hr sri 7 adrsrain.omi =rosro_d n eevtosro_d om.omi

Step 3: look for ambiguously defined columns Both r s r a i n and r o scontain columns called "room_id". So we need to prefix the r o _ dcolumn in the SELECT list with eevtos om omi "reservations.". Note that we don't have to prefix s a t t m and e d t m because these columns are only present in r s r a i n . tr_ie n_ie eevtos
slc rsrain.omi,sattm,edtm eet eevtosro_d tr_ie n_ie fo rsrain,ros rm eevtos om weeue_d=3 hr sri 7 adrsrain.omi =rosro_d n eevtosro_d om.omi
philip.greenspun.com/sql/queries.html 10/11

10/20/12

Queries

Step 4: test Test the query to make sure that you haven't broken anything. You should get back the same rows with the same columns as before. Step 5: add a new column to the SELECT list We're finally ready to do what we set out to do: add r o _ a eto the list of columns for which we're querying. omnm
slc rsrain.omi,sattm,edtm,rosro_ae eet eevtosro_d tr_ie n_ie om.omnm fo rsrain,ros rm eevtos om weeue_d=3 hr sri 7 adrsrain.omi =rosro_d n eevtosro_d om.omi

Reference
Oracle8 Server SQL Reference, SELECT command section

Next: complex queries philg@mit.edu Add a comment

philip.greenspun.com/sql/queries.html

11/11

You might also like