[...] customer, say ‘Prefect’. Accessing the underlying ER model, represented on the left of the figure, means here that we can display additional details about this customer, say his first name and address, or a list of his recent purchases. That is, the value is not seen as isolated, but as an element of a larger composite entity, here the customer, in relationship with other entities, e.g. purchase orders.

Figure 2: Relationship navigation

The Pivot Table, a feature found in MS Excel, implements such a mechanism and, incidentally, illustrates the shortcomings that an integration solution should avoid. The pivot table computes an aggregate, e.g. a sum or an average, of a collection of values grouped along some dimensions. It is the spreadsheet representation of an OLAP cube. It takes the form of a table, i.e. a collection of cells, where horizontal and vertical headers represent the chosen dimensions and where each non-header cell is an aggregate. A right-click on an aggregated value pops up a context menu that offers to display the details of this aggregate, that is, the list of individual values that were summed or averaged to produce this cell content. This is a form of relationship navigation: the aggregated value is in relation with the individual values used to compute it. However, the navigation experience offered by this method poses two problems: it is fixed and closed.

The navigation is fixed since it is not possible to specify how the details are displayed. In the pivot table case, they are displayed in a new worksheet as a table where each column is a dimension and each row an individual value. But suppose that you modify this new worksheet to compute some custom aggregation, say a sum of all values greater than 100. If you happen to need a similar computation for some other aggregate values, you will unfortunately have to do the work again, as the details of each aggregate are displayed in their own newly created worksheet. There are workarounds. For example, the computation can be programmed in a separate worksheet that refers to the worksheet automatically produced by this tool. But this method is more complex. What is needed here is the possibility to customize the navigation.

The navigation is also closed since the details of an aggregate are accessible only from within the pivot table, through this particular context menu. It is not possible, for example, to access the details of an aggregate from another tool or from a formula expression, nor is it possible to refer to the origin of a cell value from a formula or another tool. Only the resulting aggregate value is accessible. The structure to which it belongs, a lattice in this case, is known to the pivot table, since it allows users to drill down, roll up or display the [...] between them in spreadsheet programming. However, we observe that this situation is not due to a defect in the pivot table itself. It has to do with the approach that consists in integrating the spreadsheet directly with the resource layer (see Section 2.2 and Figure 1), where resources are heterogeneous. Integrating with the service layer makes it possible to build a single solution to access a variety of systems, and closedness in this case would hopefully become irrelevant.

We also want to emphasize that a relationship navigation is different from a parametric importation. The difference is that relationships are pre-built in the conceptual model and users do not need to express them; they are ready to use. Relationships are valuable when the parameters are not trivial (for example, when obtaining the purchase orders of a customer involves joining several tables with composite foreign keys). Roughly put, relationship navigations are to parametric data access what SDOs are to SQL.

Now that we have a clearer picture of what to expect from an integration solution with SOA, we propose to examine existing approaches to data importation and see how well they do along these five dimensions.

3.2 Review of existing approaches

We observed that approaches to data importation and manipulation can usefully be classified into two categories: formula-based importation and external mapping definition. For each, we outline its main traits, give some examples of commercial products, and discuss their relative merits according to the criteria identified in the previous section.

3.2.1 Formula-based importation

In this model, the grid representation of external data is obtained from formula evaluation, as illustrated in Figure 3. Examples of this method for MS Excel include the built-in database functions (e.g. DGET, DSUM, DAVERAGE, etc.) and the Real Time Data (RTD) provider. The function DGET(x,y,z), when used in a cell formula, retrieves from a database x the value corresponding to attribute y of a tuple identified by z. RTD is an extension mechanism that can be used by a professional developer to provide access to dynamic values. They correspond respectively to a pull and a push model of data importation. In addition to those features, a professional developer can easily extend the library of functions available in the formula language with User Defined Functions (UDF). Figure 3 illustrates this approach with a UDF Customer that takes the customer number and an attribute as parameters.

The main advantage of using formulas for data importation is that it is perfectly in line with the spreadsheet programming model and thus shares its good properties:

• The grid representation can be built incrementally and each cell can individually be modified;
[Figures: a spreadsheet grid populated through the cell formula =Customer(001, ‘lastName’), and an external mapping definition that binds Customer(001, lastName) to B2, Customer(001, firstName) to C2 and Customer(>001, {lastName, firstName}) to B3:C4. Sample data: customer 001 is Prefect, Ford; customer 002 is Dent, Arthur; customer 003 is Beeblebrox.]
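The formula-based model of Section 3.2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Excel's UDF mechanism: the dictionary CUSTOMERS stands in for the external source, and the names customer, evaluate and read are hypothetical. Each cell holds either a constant or a formula, and a formula obtains the customer number from another cell, so the importation is parametric and the grid can be built one cell at a time.

```python
# Hypothetical in-memory stand-in for the external customer source.
CUSTOMERS = {
    "001": {"lastName": "Prefect", "firstName": "Ford"},
    "002": {"lastName": "Dent", "firstName": "Arthur"},
}


def customer(number, attribute):
    """Assumed counterpart of the Customer UDF: one attribute of one customer."""
    return CUSTOMERS[number][attribute]


def evaluate(sheet):
    """Evaluate every cell; a formula is a function that receives a cell reader."""
    def read(cell):
        value = sheet[cell]
        # Formulas are evaluated on demand (pull model); constants are returned as-is.
        return value(read) if callable(value) else value
    return {cell: read(cell) for cell in sheet}


# Each cell carries its own formula, so the grid is built incrementally.
sheet = {
    "A2": "001",
    "B2": lambda read: customer(read("A2"), "lastName"),
    "C2": lambda read: customer(read("A2"), "firstName"),
}

print(evaluate(sheet))  # B2 and C2 are computed from the number held in A2

# Changing the parameter cell changes the imported values on re-evaluation.
sheet["A2"] = "002"
print(evaluate(sheet))
```

The sketch deliberately omits cycle detection and caching, which a real spreadsheet engine would need.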
In this example, three different feeds have been accessed in three different rows of the spreadsheet. Details of those feeds are provided in other columns. For row 8, the last refresh date and the most recent news item are accessed using the formulas <A8>.LastModified and <A8>.Channels[0].Items[0] respectively. Again, these formulas can be entered by using the Object Explorer, as described below. Other rows use similar formulas, with only the cell reference changed. This illustrates that any level in the nested structure of the RSS document can be freely laid out on the worksheet. It also shows that the individual components of the composite value contained in a cell can be accessed directly within formula expressions using only cell references, hence avoiding the impedance mismatch problem faced in traditional spreadsheet programming.

The composite value contained in cell A8 can also be displayed in detail using a template corresponding to its type (see Section 4.2). Figure 6(b) shows a detailed display of the RssFeed instance of cell A8. In this template, all formulas use the keyword obj to refer to the current instance to be displayed. For example, the formula used in cell C6 in Figure 6(b) is obj.Channels[0]. This formula returns an object of type RssChannel, whose default string representation is the title of the channel.

The Object Explorer can be seen on the right of Figure 6(b). In this mode, it displays the details of the type contained in the selected cell which, in our case, is the type RssChannel since cell C6 is selected. The Object Explorer can be used to build a layout by simply selecting one of the properties of the object contained in the selected cell and dragging the corresponding node over to an empty cell. This results in the formula corresponding to that node being copied into the cell. For instance, one can see that the node description is selected, and the corresponding formula is displayed in the bottom part of the panel.

The detailed layout of Figure 6(b) can be reused for any instance of the RssFeed class. Hence, similar details can be obtained for any of the three RssFeed objects in Figure 6(a) through a simple click. Moreover, it is possible to create several templates, corresponding to as many views of complex data of a given type. Templates can also be nested. For instance, it is possible to create a template for the type RssChannel and access it from any cell containing a composite value of that type (e.g. cell C6 in Figure 6(b)).

6. RELATED WORK

We already reviewed in Section 3.2 the existing approaches to data importation in spreadsheets, and our discussion was focused on mainstream spreadsheets. But spreadsheets have also received sustained attention from the research community. Several proposals have been made to extend spreadsheets in order to introduce features found in conventional programming languages and make these features easy to exploit for end-users.

An early work in that area is the Analytic Spreadsheet Package (ASP) [16], where the language Smalltalk 80 is used to build a spreadsheet whose cells can contain instances of objects. Object visualization within cells is provided either by the default string conversion mechanism offered by Smalltalk with the printString protocol, or by instantiating objects that derive from DisplayObject to build custom visualizations.

A more advanced integration of object-oriented features, as well as functional programming, into the spreadsheet environment was proposed in [9]. This work extends the traditional spreadsheet to support programming abstractions such as encapsulation, reuse, recursive functions, higher-order functions or polymorphism. It defines a full spreadsheet-based language where worksheets are seen as methods and, when grouped in a workbook collection, collectively define a class.

In [12], an extension to Excel is proposed to allow end-users to build custom functions. In this approach, the type system of the spreadsheet is extended so that whole matrices can be stored in a single cell. A cell that contains a matrix is displayed in a different way so that end-users clearly know that its content is composite. Forms/3 [7] is a prototype that implements several extensions to spreadsheet programming. It allows, for instance, recursive computations and exception handling. Cells in Forms/3 can contain any type of data.

All these approaches are very interesting; they explore how to redefine the spreadsheet programming model in order to bring into it the powerful abstractions found in other
languages. The integration of spreadsheets in SOA would be much facilitated if mainstream spreadsheet applications (and their user base) decided to adopt some of the ideas proposed in those works. For example, if Excel actually handled matrix types as cell values, we could benefit from this type system and propose a richer mapping of composite external objects.

Our concern in this article is almost the opposite, since we precisely try to leave the spreadsheet programming model untouched. SpreadATOR acts as a middleware; its formula language could as well be hidden from users, who could choose to rely solely on the visual assistant (the object explorer). SpreadATOR, for instance, does not provide any mechanism to actually build those object abstractions. Thanks to this, we were able to implement our prototype as an add-in to an existing spreadsheet application. We try to bring to end-users the small part of the benefits of programming at the conceptual level that, we think, does not imply any major change in the way they already work with spreadsheets.

7. CONCLUSION

In this article, we have defined the problem of spreadsheet integration with data services from the viewpoint of spreadsheet users. We tried to answer how developers could benefit from the higher-level and integrated view of IT resources offered by service-oriented architecture. More specifically, we discussed how to leverage, in the importation and manipulation of data, the conceptual modeling of information as provided by data services and APIs such as SDO or ADO.Net. We identified the shortcomings of existing solutions and proposed a novel approach to spreadsheet integration called SpreadATOR.

An important aspect of SpreadATOR is that it can be integrated with existing spreadsheet applications such as MS Excel. It does not require an extension of the spreadsheet language and can act as a middleware. At the same time, its interface blends with MS Excel. This allows users who need it to easily introduce programmatic aspects in their importation, for example using a cell reference to make it parametric. An additional benefit is an improved readability of the mapping. Finally, the object explorer offers the necessary support to avoid formula input and is very similar to the schema mapping tool already proposed in Excel.

The advantage of specialized importation tools over the generic approach followed by SpreadATOR is their capacity to provide very specific wizard dialogs or visual metaphors to assist users (e.g. they can refer to dimensions when accessing an OLAP server and to tables when accessing a relational database). However, we believe that SpreadATOR is in fact compatible with these high-level features. We argue that these specific assistants should output a mapping definition in a common formula-based importation language such as the one introduced in this article. This can easily be done since, as demonstrated, SpreadATOR is able to work with any (.Net) API. For end-users, the benefit is a spreadsheet application over which they have complete control, as well as the possibility to combine various importation tools in the same application. By using a common mapping definition, importation systems would also leverage the common facilities offered by SpreadATOR, such as the template mechanism or the meta-data management, and save significant development time.

8. REFERENCES

[1] Excel Services Overview. Technical report, Microsoft Corp., 2006.
[2] ADO.Net Tech Preview Entity Data Model. Technical report, Microsoft Corp., June 2006.
[3] Designing Reports with the Microsoft Excel Add-in for SQL Server analysis services. Microsoft Corp., 2004.
[4] SAP NetWeaver: A Complete Platform for Large-Scale Business Intelligence. Technical report, Winter Corp., 2005.
[5] G. Alonso et al. Web Services - Concepts, Architectures and Application. Springer-Verlag, 2004.
[6] A. Blackwell and T. Green. HCI Models, Theories, and Frameworks: Toward an Interdisciplinary Science. J.M. Carroll Editor, chapter Notational systems – the cognitive dimensions of notations framework. Morgan Kaufmann, 2003.
[7] M. Burnett et al. Forms/3: A first-order visual language to explore the boundaries of the spreadsheet paradigm. Journal of Functional Programming, 11(2):155–206, 2001.
[8] M. Carey. Data delivery in a service-oriented world: the BEA AquaLogic data services platform. In SIGMOD’06, pages 695–705, New York, USA, 2006.
[9] C. Clack and L. Braine. Object-oriented functional spreadsheets. In GlaFP’97, September 1997.
[10] K. Gile. Keeping IT sane in a crazy BI world of Excel. Technical Report 36353, Forrester, 2005.
[11] A. Halevy et al. Enterprise information integration: successes, challenges and controversies. In SIGMOD’05, pages 778–787, New York, USA, 2005.
[12] S. P. Jones, A. Blackwell, and M. Burnett. A user-centred approach to functions in excel. In ICFP ’03, pages 165–176, New York, NY, USA, 2003.
[13] K. Laker. Exploiting the power of oracle using microsoft excel. Technical report, Oracle Corp., 2004.
[14] E. Lippert and E. Carter. .Net programming for office: using C# with Excel, Word, Outlook and Infopath. Addison Wesley, 2005.
[15] B. A. Nardi and J. R. Miller. The spreadsheet interface: A basis for end user programming. In INTERACT’90, pages 977–983. North-Holland, 1990.
[16] K. W. Piersol. Object-oriented spreadsheets: the analytic spreadsheet package. In OOPSLA’86, pages 385–390, New York, USA, 1986. ACM Press.
[17] F. Rice. Creating XML mappings in excel 2003. Technical report, Microsoft Corp., 2005.
[18] C. Scaffidi, M. Shaw, and B. Myers. Estimating the numbers of end users and end user programmers. In VL/HCC’05, pages 207–214, 2005.
[19] Next-generation data programming: Service data objects. Technical report, IBM, BEA, 2003.
[20] http://www.service-architecture.com.
[21] K. Williams and B. Daniel. An introduction to service data objects. Java Developer’s Journal, October 2004.