Professional Documents
Culture Documents
(DEMO ABSTRACT)
517
the system or at least look at several applications in some
detail. This is especially true with respect to understand-
ing just how flexible the DEVise visual model really is. We
refer the reader to the DEVise paper that appears in these
Q I)*,
..!
V,*
V,su,l,mms
I
. . . . . .
--------------- I,xm”lr”,,-,1.vw
the reader to understand how it goes beyond other related &- ...... ..--.&---..-.-.-.g*sFr.b.
tools. We propose to demonstrate each of these scenarios at
SIGMOD. (For details on these applications, including ex-
ample full-color DE Vise screens, see the DE Vise home page
at http: //vrru. CS. riisc. edu/-devise)
Financial Data Exploration: In collaboration with Figure 2: DEVise integration model.
the Applied Securities Analysis program in the UW Busi-
ness SchooI, we’ve developed an environment for integrated
visual exploration of financial datasets from several vendors, images are looking for correlations in the records that can
including Compustat, ISSM and CRSP. This application be used to predict pathological features in the associated
illustrates DEVise’s abtity to access data from a variety images. Using DEVise, we have developed a visual pre-
of formats, without requiring users to store all data in a sentation that allows a biologist to extract records satisfy-
common repository, and its use in integrating information ing certain selection criteria, identify subsets of the selected
from many sources-users can now look for correlations and records that satisfy further conditions, and then retrieve the
trends using the combined information from a variety of ven- associated images at any desired level of resolution. The
dors (Figure 3). development of the DEVise application was done using a
R-Tree Validation: The well-known R-tree multidi- visuaf interface, using the notions of views, mappings, links
mensional index organizes a collection of points and boxes etc. supported by DEVise, and the biologists’ exploration
(which ‘bound’ spatial objects). Each leaf node (page) con- is also done entirely through a visual interface supporting
tains several points or boxes, and each index node contains DEVise’s notion of visual queries (Figure 5).
several boxes (each of which ‘bounds’ all the contents of a Soil Sciences Classification: This application illus-
child page). While developing R-tree algorithms, it is im- trates an important point: users often want to generate var-
portant to understand how different datasets are ‘packed’ ious kinds of summaries of their data, explore the summary
into R-trees, and this can be accomplished naturally by vi- information, and then be able to interactively look at the
sualizing the tree. An R-tree can be visualized in DEVise as ‘corresponding’ portion of the underlying data. This makes
follows. First, note that each box is a data record with fields it necessary for the visualization component of DEVise to
(xI., vi., xh,, vk, ); this information can be used to ‘map’ each understand the semantics linking the summary and the sum-
data record to a rectangle on-screen. By mapping all records marized data. A research group in Soil Sciences is working
in a node, we can ‘see’ the node as a collection of boxes, and on automatic classification of forestry-canopy images, which
by mapping all the nodes in a given level, we ‘see’ a hon- are being generated in large numbers as part of the BOREAS
zontaf slice of the R-tree. Given such a visual presentation, field experiments. They want to process images and classify
the visual operations supported in DEVise allow a user to the pixels into categories like ‘trees’ and ‘sky’, and even
explore the tree, level by level, to scan around in a level and ‘branches’, ‘soil’ , ‘sunlit leaves’, etc. We’ve combined a tool
on a page, to zoom into a specific region of the tree, and called BIRCH [3], which was developed for finding clusters
even retrieve individual data records (‘boxes’ in leaf nodes, of points in multidimensional datasets, with DEVise to cre-
in this example). We note that we used the visuafization to ate an analysis environment that they are currently using
find some subtle bugs in our R-tree bulk loading algorithms on a daily basis for classifying images. (For details of this
that would otherwise have been extremely difficult to spot application, see the DEVise paper in this volume. )
(Figure 4).
FamiIy Medicine and NCDC Weather Data: DE
2 DEVise System Architecture and Optimization
Vise is being used by the UW Family Medicine department
to provide physicians access to data that is collected and
The current version of DEVise contains over 100,000 lines of
maintained independently by five cfinics in the Madison
C++ code and about 20,000 lines of Tcl/Tk code. The data
area. In addition to the clinic data, which is presented vi-
import, buffer management, query processing, and visual-
sually in such a manner as to allow physicians to look for
ization facilities are located in the DEVise Server, while the
certain trends and correlations, we provide uniform access
GUI runs as a separate DEVise Client. Several client types
to weather data for the Madison area from the National Cli-
are supported, including a Tcl/Tk client, Java client, and a
mate Data Center (NCDC) data repository (Figure 1).
batch client. The batch client is used when DEVise acts as a
Cell Image-set Exploration: In this application, we
back-end visualization engine for a Web site. Web resource
are working with biologists who are dealing with large sets
requests (cgi-bin URL’S) are translated to DEVise visualiza-
of images of cells, where each cell image has an associated
tion requests and forwarded to DEVise via the batch client.
record with over 30 fields, containing information about
The resulting image is transported back to the Web client
when and where the image was recorded and details about
which then produces new zoom/scroll requests baaed on the
the content of the image. The biologists working with these
user’s actions. An image map file produced by DEVise lets
518
Lam Daily max temperature and clinic visits in 1995
I
In ho~p delivery and post-natal visRs In hosp dellvery and pre-natal visits
10
I
1
.. .1, . .
Figure 1: Visualization of Hospital Clinic Visits. Top window compares Madison temperature pattern (line graph) with
number of clinic visits. The middle windows compare the number of pm-natal and post-natal clinic visits (thin bars) with
the total number of in-hospital delivery of newborn babies. The bottom window shows the same data as percentages.
the Web user select and zoom in to one of potentially many oneered several novel concepts in the areas of visualization
grapha in an image. paradigms and implementation techniques, in particular, is-
The client-server architecture also provides the frame- sues related to visualization of very large datasets.
work for collaborative computing. This is achieved by repli-
cating the client-server dialog on other servers. We note that Acknowledgements
DEVise servem can be geographically dispersed and there-
fore only exchange meta-data. Collaborating servem get ac- The work presented in this paper was supported by NASA
cess to the datasets via their own data import mechanisms; grant NAGW-3921 and as a Massive Digital Data Systems
data might be on a local disk on some servers, while other (MDDS) project sponsored by the Advanced Research and
servers will download and cache the data from a network Development Committee of the Community Management
source. Stail Raghu Ramakrishnan’s work was additionally sup-
The DEVise back-end is essentially a middleware database ported by a Packard Foundation Fellowship in Science and
engine. It is used to execute queries on both local and re- Engineering, a PYI Award with matching grants from DEC,
mote data systems. We use the term data provider because HP, IBM, Tandem, and Xerox, and NSF grant IRI-9011563.
the data may be in a variety of heterogeneous sources. Ex-
amples include local file systems, the World Wide Web, re-
References
mote DEVise engines, or foreign databases. The DEVise
engine accepts queries referring to multiple data providers, [1] J. A. Blakeley. Data access for the masses through ole-
performs global query optimization, and decomposes the db. In Proceedings oj ACMSIGMOD,Montreal, Canada,
query into a set of subqueries to be shipped to the apprm 1996.
pnate sites. The engine assembles the result, performing
necessary operations that were not done at individual sites. [2] A. Silberschatz and S. Z. et al., editors. Proc. NSF Work-
The DEVise integration model is shown in Figure 2. shop on Stmtegic Directions in Computing, Cambridge,
MA, 1996.
3 Summary [3] T. Zhang, R. Ramakrishnan, and M. Livny. Birch: An
efficient data clustering method for very large databases.
DEVise is a major effort underway at Wiscomin to combine
In Proceedings of ACM SIGMOD, Montreal, Canada,
database querying and visualization, and the tool is already
1996.
in widespread use in real applications. The project has pi-
519
I I I
F@re 3: Visual Integration of IBM Stock Thding Data (Price and Volume) from Two Sources. The thin and thick stock
price line graphs allow easy cross-validation of the two data sources. Stock prices are compared with Standard & Poor’s 500
Index (upper line in top window),
I
a
\
11 1. —
lb
1 I
Figure 4: R-Tree Validation. Root node (top) shows location Fiu 5: Visual Exploration of Cell Features and Immzes.
in the tree of other views. Lower objects (level 2 in middle Tie image shown ii a the result of a two-stage selec~on
and leaf in bottom) are filled with a pattern. The level above based on feature location (top right) and elapsed time (top
is superimposed with thick outlines for depth-cueing. left ).
520