Welcome to Scribd. Sign in or start your free trial to enjoy unlimited e-books, audiobooks & documents.Find out more
Standard view
Full view
of .
Look up keyword
Like this
0 of .
Results for:
No results containing your search query
P. 1
Making Database Systems Usable

Making Database Systems Usable

Ratings: (0)|Views: 104|Likes:
Published by vthung

More info:

Published by: vthung on Apr 10, 2009
Copyright:Attribution Non-commercial


Read on Scribd mobile: iPhone, iPad and Android.
download as PDF, TXT or read online from Scribd
See more
See less





Making Database Systems Usable
H. V. Jagadish Adriane Chapman Aaron ElkissMagesh Jayapandian Yunyao Li Arnab Nandi Cong Yu
University of MichiganAnn Arbor, MI 48109-2122
Database researchers have striven to improve the capabilityof a database in terms of both performance and functional-ity. We assert that the
of a database is as importantas its capability. In this paper, we study why database sys-tems today are so difficult to use. We identify a set of fivepain points and propose a research agenda to address these.In particular, we introduce a
presentation data model 
direct data manipulation 
with a
schema later 
approach. We also stress the importance of 
across presentation models.
Categories and Subject Descriptors
H.2.0 [
]; H.5.0 [
General Terms
Design, Human Factors
Database, Usability, User Interface
Database technology has made great strides in the pastdecades. Today, we are able to efficiently process ever largernumbers of ever more complex queries on ever more humon-gous data sets. As a field, we can be justifiably proud of what we have accomplished.However, when we see how information is created, ac-cessed, and shared today, database technology remains onlya bit player: much of the data in the world today remainsoutside database systems. Even worse, in the places whereSupported in part by NSF grant IIS 0438909 and NIHgrants R01 LM008106 and U54 DA021519. We thank MarkAckerman, Jignesh Patel and Barbara Mirel for their com-ments on a draft of this paper.
Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.
June 11–14, 2007, Beijing, China.Copyright 2007 ACM 978-1-59593-686-8/07/0006 ...
database systems are used extensively, we find an army of database administrators, consultants, and other technicalexperts all busily helping users get data into and out of adatabase. For almost all organizations, the indirect cost of maintaining a technical support team far exceeds the di-rect cost of hardware infrastructure and database productlicenses. Not only are support staff expensive, they alsointerpose themselves between the users and the databases.Users cannot interact with the database directly and aretherefore less likely to try less straightforward operations.This hidden opportunity cost may be greater than the vis-ible costs of hardware/software and technical staff. Mostof us remember the day not too long ago when booking aflight meant calling a travel agent who used magic incanta-tions at an arcane system to pull up information regardingflights and to make bookings. Today, most of us book ourown flights on the web through interfaces that are simpleenough for anyone to use. Many enjoy the power of beingable to explore options for themselves that would have beentoo much trouble to explain to an agent, such as willingnessto trade off price against convenience of a flight connection.Search engines have done a remarkable job at directlyconnecting users with the web. The simple keyword-basedquery mechanism allows web users to issue queries freely; thealmost instantaneous response encourages the user to refinequeries until satisfactory results are found. While searchengines today are still far from perfect, their huge successsuggests that
is key. An information system pro-vides value to its users through the ability to get informationinto and out of the system easily and efficiently. Unfortu-nately, databases today are hard to design, hard to modify,and hard to query.One obvious question to ask is whether database systemscan simply have a search engine sit on top of them and letthe search engine handle the interaction with the users. Theanswer, we argue, is “No.” While the search engine interfaceworks well for the web, it does not address all the usabil-ity problems database systems are facing. This is due tomany characteristics that stem from users’ expectations forinteracting with databases, which are fundamentally differ-ent from expectations for the web.The first characteristic that users expect is the ability toquery the database in a more sophisticated way. While usersare content with searching the web with keywords, they wantto express more complex query semantics when interact-ing with databases simply because they know databases aremore than collections of documents and there are structures
inside databases that can be used to answer their queries.The use of boolean connectives, quote phrases, and colonprefixes are just the beginning of the complex semantics wecan expect from database users. For example, when query-ing an airline database for available flights, a user will natu-rally search on explicit travel dates, departure/arrival citiesor airports, and sometimes even airlines. Such complex se-mantics cannot be handled by a keyword-based interfacethat search engines provide.The second characteristic is that users expect more pre-cise and complete answers from database search. One of thehidden reasons behind the success of search engines is thefact that web users can tolerate a search that returns manyirrelevant results (i.e., less than perfect precision) and theytypically do not care if there are relevant results missing(i.e., less than perfect recall), as long as
relevant re-sults are returned to them. In the database world, however,these assumptions no longer hold. Users issuing a databasesearch expect to obtain
the relevant results:anything less than perfect precision and recall will have tobe explained to the user.The third characteristic is that users have an expectationof structure in the result. In the case of a web search, a userexpects simply a set of links, with almost no interrelation-ship between them. In the case of a database search, a usermay expect to see a table, a network, a spatial presentationon a map, or a set of points in a multidimensional space —the specific structure depends on the users mental model of the application.The fourth characteristic is that users often expect to cre-ate and update databases. While search engines are allabout searching, database users frequently want to designnew databases to store their own information or to generateinformation to put into existing databases. The usabilityissues in designing a database structure from scratch andcreating structured information for existing databases arecompletely unexplored so far and need to be addressed fordatabases to become widely used by ordinary users.These fundamental characteristics suggest that the data-base research community needs to think about database us-ability not just as a query interface, but as a more compre-hensive and integral component of the database systems.Structured query models like SQL and XQuery are thecurrent means provided by database systems to allow usersto express complex query semantics. While powerful, thosemodels are fundamentally difficult for users to adopt becausethey require users to fully comprehend the structure of thedatabase (i.e., schema) and to express queries in terms of that particular structure. We argue that while the logicalschema of the database is an appropriate abstraction overthe underlying physical schema for data organization, it isstill at a level too low to serve as the abstraction with whichusers should interact directly. Instead, a higher level
pre-sentation data model 
abstraction is needed to allow usersto structure information in a natural way. We describe thisproblem of “painful relations” in Section 4.1. Because differ-ent users have different views (i.e. presentation data models)on how information should be organized, one would natu-rally like those presentation data models to be personalizedfor individual users. Our experience with the MiMI [51]project, however, has told us a different story. When usersare presented with multiple ways to access the informationbut do not understand the underlying differences betweenthe views, they tend to become confused and lose trust inthe system. A good presentation data model can be enor-mously helpful, but too many such models becomes counter-productive. We describe this problem of “painful options”in Section 4.2.The expectation of perfect precision and recall for databasesearch introduces yet another usability issue: the need to is-sue explanations to the user when the database system pro-duces unexpected results or fails to produce the expectedresults. The first case is a failure in precision—some of the results produced are not relevant in the mind of theuser. The database system will need to be prepared to an-swer questions like “where does this result come from?” andprovenance tracking [89] is essential in this regard. Second,some potentially relevant results may not be produced—afailure in recall. Some of those missing results can be iden-tified by the system. For example, consider a query askingfor “all flights with booking fee less than $10” and assumethere are flights in the airline database that do not havea known booking fee. It is reasonable, or even desirable,not to return those flights to the user. However, the sys-tem must be able to explain to the user that those flightsare not included in the result because their booking fees areunknown, and not because their booking fees exceed $10.Some missing results cannot ever be identified by the sys-tem. For example, consider a query asking for “a Tuesdayflight from DTW to PEK on any airline” and assume the air-line database does not carry flights from Northwest Airlines,which does have a flight between DTW and PEK. While thesystem may not be aware of the existence of such a flightwith Northwest Airlines, it still needs to explain to the userthat this omission is due to its incomplete coverage. Wedescribe this problem of “unexpected pain” in Section 4.3.Another way in which a user can get unexpected resultsfrom a system is when the user makes an error in specify-ing the query. Unfortunately, given how difficult it can beto specify a query correctly, such errors are all too com-mon. A fundamental difficulty in a traditionally architecteddatabase system is that the user has a labor-intensive queryconstruction phase followed by a potentially lengthy queryevaluation phase, so that after the results are obtained thecosts to reformulating the query are often high. Worse still,errors in query construction remain uncaught until the con-struction phase is completed and the query submitted forevaluation. This sort of write and debug cycle is somethingwe computer scientists get used to as part of learning howto program. Other users need mechanisms so that they cansee what they are doing with the database at all times. Thisabsence of the ability to manipulate data directly is what wecall “unseen pain” and discussed in Section 4.4.Finally, the users’ expectation of being able to create adatabase from scratch and/or to create structured informa-tion for an existing database introduces a whole new set of usability issues that are currently unexplored. Requiring auser to go through a rigorous schema design process beforeany information can be stored, as in the case of many cur-rent database design practices, puts too much burden on theuser and runs the risk of user forgoing a database approachaltogether. Similarly, requiring a user to study the exist-ing database schema and restructure their data accordingto this schema before she can update the database with herown data also imposes an unnecessary burden. We describethis problem of “birthing pain” in Section 4.5.
Paper Outline
: The rest of the paper is organized as fol-lows. We describe the current research activities related todatabase usability in Section 2 and describe our ongoingMiMI project [51] as a case study of database usability inSection 3. In Section 4, we describe in detail the usabil-ity challenges facing database systems. Section 5 presentsa research agenda that suggests some directions for futureresearch in database usability. Finally, we conclude in Sec-tion 6.
There is evidence that human error is the leading causefor the failure of complex systems [77, 15]. To this effect, thenature of human usage has received considerable attention inresearch, e.g., there is a recent move in the software systemscommunity to conduct serious user studies [91]. Databaseusability started to receive attention more than 25 yearsago [32]. Since then, research in database usability has beenfollowing two main directions: innovative query interfacedesign (including both visual and keyword-based interfaces)and database personalization. We describe recent accom-plishments in those areas, as well as other related areas likeautomatic configuration and management of database sys-tems.
2.1 Visual Interface
Visual query specification interface is perhaps the oldestand most prominent field related to database usability, interm of both academic research (e.g., QBE [100]) and in-dustrial practices (e.g., Microsoft Access and IBM VisualXQuery Builder). Many visual query interfaces have beenproposed to assist users in building queries incrementally,including XQBE [13], MIX [71], Xing [37], VISIONARY [9],Kaleidoquery [73] and QBT [85].Forms-based query interface design has also been receiv-ing attention. Early works on such interfaces include [26, 36]and provide users with visual tools to frame queries and toperform tasks such as database design and view definition.The GRIDS system [84] generates forms that allow usersto pose queries in a semi-IR, semi-declarative fashion; theAcuity project [90] developed form generation techniquesfor data-entry operations such as updating tables. More re-cently in XML database systems, efforts have been made toshield users from both the details of the XQuery syntax andthe textual representation of XML. FoXQ [1] and EquiX [28]are systems that helps users build queries incrementally bynavigating through layers of forms. Semi-automatic formgeneration tools have been proposed in QURSED [79]. Fur-thermore, [93] proposes the use of XML rather than HTMLto represent forms, making them more reusable, scalable,machine-readable and easier to standardize. Another in-teresting project in UI design is DRIVE [68], a runtimeuser interface development environment for object-orienteddatabases, which uses the NOODL data model to enablecontext-sensitive interface editing.
2.2 Text Interface
Our sister field of information retrieval has had wideradoption by normal users. This has prompted database in-terface designers to take the approach of providing databasesystems with an IR-style keyword-based search interface.Many systems such as DBXplorer [2], BANKS [11], DIS-COVER [47] and an early work of Goldman et al. [40] at-tempt to extend the simplicity of keyword search to rela-tional data. This is not merely an integration of full-textsearch with relational query capability—such an approachstill requires knowledge of the database schema. Rather, thecore principle is to provide keyword search
tuples.Most of these systems find tuples that individually matcheach keyword and then find a path of primary/foreign keyrelationships that relate the tuples. Result ranking is typ-ically provided based on the path of primary/foreign keyrelationships. A common ranking approach is to use somevariation of PageRank [14], where documents and hyperlinksare replaced with tuples and referential constraints.A parallel thread of research examines keyword search inXML databases. The basic problem is the same as in re-lational databases: given a set of keywords, we must finddata fragments that match the individual keywords. In-stead of only referential constraints, XML databases mostlyhave parent/child relationships between individual elements.The problem of determining if a data fragment is meaning-fully related becomes much more important. Approaches todetermine the meaningfulness as well as the relevance of adata fragment have ranged from simple tree distance used byXSEarch [29] to XRANK’s adaptation of PageRank [41] toapproaches such as Schema-Free XQuery [62, 63] that lookat either the entire database or the database schema [98] todetermine if a data fragment is meaningful.A different approach with a long history is the construc-tion of natural language interfaces to databases [5]. Naturallanguage understanding is an extremely difficult problem,and commercial systems such as Microsoft English Query [12]tend to be unreliable or unable to answer questions outsidea manually predefined narrow domain [83].Other systems assume the user has some imperfect knowl-edge of the structure of the data as could occur with het-erogeneous or evolving schema. This is an intermediate stepin user complexity between pure keyword search and rigidstructural search. Research such as FleXPath [4] and thework of Kanza and Sagiv [54] has focused on relaxation of fully specified structural queries; other systems such as Ju-ruXML [21] support querying and result ranking based onsimilarity to a user-specified XML fragment.A more recent trend in keyword-based search is to analyzea keyword query and automatically discover the hidden se-mantic structures that the query carries. This trend has in-fluenced the design of projects for both traditional databasesearch [52] as well as web search [65].
2.3 Context and Personalization
Advancements in query interface design, while making iteasier for users to interact with the database, are mostlygeneric: they do not take into account the specific userand her unique problems. The notion of 
ad-dresses this problem by attempting to customize databasesystems for each individual user [33]. This approach hasreceived great attention in the context of websites [80, 69],where the content and structure of the website is tailoredto the needs of each user by analyzing usage patterns. Thenotion of user context and personalization has also foundinterest in the information retrieval community, where theranking of search results is biased using a certain personal-ized metric [45, 46, 53].

Activity (5)

You've already reviewed this. Edit your review.
1 hundred reads
rarariz liked this
rarariz liked this
Elham Abied liked this
hasan hawasle liked this

You're Reading a Free Preview

/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->