MASARYKOVA UNIVERZITA
FAKULTA INFORMATIKY
Lukáš Šembera
Declaration
I hereby declare that this paper is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during the elaboration of this work are properly cited and listed in complete reference to the due source.
Lukáš Šembera
Acknowledgement
I would like to thank my technical advisor Jiří Pechanec from Red Hat Czech for his valuable comments and suggestions. I would also like to thank my fiancée Daria for her support during writing.
Abstract
This thesis aims to compare three implementations of the JPA standard, specifically Hibernate, OpenJPA and EclipseLink. Besides the comparison, it also describes the migration of various real-world applications between those JPA implementations and documents the issues that developers might typically run into. The practical part involves developing an application which supports migrating projects between those three JPA providers.
Keywords
JPA, JPA2, Hibernate, OpenJPA, EclipseLink, Java, persistence, relational, databases, Scala
Contents
1 Introduction
  1.1 Database management systems
    1.1.1 Relational databases
    1.1.2 Object-oriented databases
    1.1.3 NoSQL databases
  1.2 Object-relational mismatch
  1.3 Brief history of Java persistence solutions
    1.3.1 JDBC
    1.3.2 EJB 2.x entity beans
    1.3.3 JDO
    1.3.4 myBatis
  1.4 JPA
  1.5 Goals of the thesis
2 Comparison of JPA providers
  2.1 Methodology of the comparison
  2.2 Identifier generation
  2.3 Performance
    2.3.1 Batch inserts
    2.3.2 Searching by ID
    2.3.3 Basic JPA QL test
    2.3.4 Basic criteria API test
    2.3.5 Aggregate function
    2.3.6 Performance summary
  2.4 Type conversion
  2.5 Caching support
  2.6 Entity lifecycle and transactional events
  2.7 Schema generation
  2.8 Support for stored procedures
  2.9 Integrating with other frameworks
  2.10 Licenses
  2.11 Documentation quality
  2.12 Build systems
  2.13 Summary
3 Experimental migration of JPA applications
  3.1 Migrating from Hibernate
1 Introduction
Every application, except the most basic ones, has to deal with data. The very first computers were designed as black boxes receiving input, doing some calculations and producing output. Since then, computers have become much more complicated and nowadays they do much more than such simple data processing. Nevertheless, they still operate on data stored on some kind of permanent storage device, such as a hard drive.
Input data for an application could be saved, without much thinking, into an ordinary text file. However, such files are next to impossible to process by machine because they do not follow any rules which would describe their structure. For this reason, a variety of rules that the data have to follow are often introduced (e.g. the structure is described by XML with an appropriate XML Schema definition).
Even if the data are in an easily computer-readable form, the biggest problem with this file-based approach remains. It is still just a text file and, therefore, data access is limited by the I/O operations of the operating system. The demands of current enterprise applications, however, go far beyond the possibilities of such file-based persistence. We require reliability, transaction management, high-performance concurrent access, advanced user access control and much more. To support all of these advanced features, database management systems have been invented.
1.1 Database management systems
A database management system (DBMS), as defined in [1], is software designed to assist in maintaining and utilizing large collections of data. Each DBMS has its model, which describes data, data relationships, semantics and consistency constraints [2]. It is basically a theoretical foundation upon which database management systems operate.
During the last few decades, several database models have been invented. In the 1960s, IBM introduced its database management system IMS, which internally uses the hierarchical database model. The hierarchical model stores data in records, which are connected with each other through links, creating tree-like structures [2]. An evolution of the hierarchical model is the network model, which allows records to be connected in arbitrary graphs and thus makes data modelling more flexible (e.g. it allows many-to-many relationships between records). Even though hierarchical and network databases exist and are still in use¹, the models have many flaws (further discussed in [4]), which make their usage in certain scenarios particularly complicated.
1.1.1 Relational databases
In 1970, E. F. Codd published a revolutionary paper [5], in which he laid out the concept of the relational data model, the theoretical foundation of relational databases. Thanks to its flexibility², simplicity and strong but simple formal background (which allows mathematical reasoning about data), its popularity grew rapidly. Many commercial and open-source implementations exist; they are very mature and industry-proven, and the relational model itself is very well understood and documented. For these reasons, relational databases are effectively the industry standard and knowledge of them is essential for every programmer.
1.1.2 Object-oriented databases
In the last decade, under the influence of object-oriented programming, the concepts of object-oriented (OODBMS) and object-relational (ORDBMS) database management systems have arisen. OODBMS allow object graphs to be stored in the database directly and are very often integrated with the programming language itself. Thus, they provide a homogeneous environment and remove the need for various transformations when data are passed back and forth between the application and the data layer. Even though object-oriented databases have undeniable benefits, their popularity is not very high. This is not only because of the enormous amounts of data that are already stored in relational databases (and the migration of which would not be cost-free), but also because of some technical issues they are still facing and which have not yet been resolved. Moreover, vendors of relational databases are integrating various object-oriented features into their products and thus are making the need for pure object-oriented databases less urgent.
1. Probably the best known hierarchical database is the Windows System Registry [3].
2. By flexibility I mean the ability of the relational model to hide its internal data representation. Clients thus do not need any knowledge of how data are physically stored and, therefore, are not affected when the server implementation changes.
1.1.3 NoSQL databases
Recently, with the rise of interest in cloud computing, a new category of databases has emerged, the so-called NoSQL databases. NoSQL is neither a specific database model, nor an evolution of relational or object-oriented databases; it is rather a group of database products which are suited to specific scenarios, often where other solutions fail. They often offer only a subset of the features of relational databases, but they are superior in certain characteristics. For example, MongoDB is a document-oriented database which shines at speed and scalability, but by design it lacks decent transaction management and, therefore, its use case is in large clusters where transactional behaviour is not crucial.
1.2 Object-relational mismatch
Currently, most data are stored in relational databases. In programming languages, however, the object-oriented approach predominates. It best reflects reality, as it models the interactions of entities and their behaviour. In object-oriented programming, there are fundamental concepts like association, inheritance or polymorphism, which do not have corresponding counterparts in the world of relational databases.
The object-relational mismatch occurs when data representing some business information we need to process are stored in a relational database, while the application processing the data uses the object-oriented approach and models everything using objects and other OOP concepts. Therefore, transformations are needed each time data are passed between the application layer and the data layer. These transformations might not be complicated if the objects are simple data holders containing only basic data types, but once we want to make use of advanced OOP features, things get much more complicated.
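To make the mismatch concrete, the following minimal sketch flattens a small object graph into relational rows by hand. The classes Person, Address and Flattener, the comma-separated row layout and the way identifiers are assigned are all hypothetical and serve only as an illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes: a person owns a list of addresses (an association).
final class Address {
    final String city;

    Address(String city) {
        this.city = city;
    }
}

final class Person {
    final String name;
    final List<Address> addresses = new ArrayList<>();

    Person(String name) {
        this.name = name;
    }
}

// Flattens the object graph into rows, roughly the way a relational schema stores it.
final class Flattener {
    final List<String> personRows = new ArrayList<>();
    final List<String> addressRows = new ArrayList<>();
    private long nextId = 1;

    // The in-memory reference Person -> Address has to be replaced by a foreign key.
    void save(Person p) {
        long personId = nextId++;
        personRows.add(personId + "," + p.name);
        for (Address a : p.addresses) {
            long addressId = nextId++;
            addressRows.add(addressId + "," + a.city + "," + personId);
        }
    }
}
```

Even in this tiny case the object reference has to be turned into an explicit key column; inheritance and polymorphism have no comparably direct translation.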
1.3 Brief history of Java persistence solutions
1.3.1 JDBC
The object-relational mismatch can be tackled by hand using plain JDBC, which is Java's API for database access. The API is database-independent and database vendors provide JDBC drivers for their systems. Here is a very simple piece of code which saves a person into a database using JDBC:

    public void savePerson(Person p) throws SQLException {
        // DEFAULT lets the database generate the primary key
        String query = "INSERT INTO PERSON VALUES (DEFAULT, ?, ?)";
        // try-with-resources closes the statement even if the update fails
        try (PreparedStatement stmt = connection.prepareStatement(query)) {
            stmt.setString(1, p.getName());
            stmt.setString(2, p.getSurname());
            stmt.executeUpdate();
        }
    }
With plain JDBC, we have to unpack primitive properties⁷ from domain objects and put them into SQL statements manually. This is especially tedious and error-prone when we work with larger object graphs and cascaded saving.
So, for the reasons above, we usually look for a tool or a framework which does the tedious work for us: it analyses our classes, generates SQL statements, automatically binds parameters, etc. In the coming paragraphs, I will briefly discuss different approaches to Java object-relational mapping.
It is important to remember, however, that Java database programmers can never avoid JDBC entirely. Since all persistence solutions are built on top of JDBC, understanding it is essential for fine-tuning the persistence framework in certain scenarios or for checking logs when something goes wrong. As Gavin King and Christian Bauer state in their book, high-level persistence solutions are not here for programmers who do not want to learn or do not understand JDBC, SQL or the relational model itself. They are here for those who have already done it the hard way ([6]).
1.3.2 EJB 2.x entity beans
Since the very beginning of the J2EE specification there has been a technology aimed at Java persistence, called EJB entity beans. Entity beans are container-managed components providing various services, such as persistence or transaction management. The specification, however, was largely over-engineered from the beginning [7]. It builds on the fundamental concept that persistence should be non-intrusive to the application and rather be a service provided by the container. This leads to overwhelming complexity of both the specification and the applications using it. EJB entity beans were a widely used technology, but due to their complexity and general unhappiness with the specification, companies were often forced to create various proprietary persistence solutions. Several open-source frameworks have also been created, with Hibernate being the most widely used one. For backward compatibility, EJB entity beans are still present in the Java EE specification, so every Java EE compliant application server has to support them. They are, however, considered deprecated in favour of the new JPA specification.
7. By primitive properties I mean properties of primitive data types, which are directly supported by database systems.
1.3.3 JDO
With rising frustration with EJB entity beans, there was an attempt to come up with an alternative: a new persistence specification which would work with POJOs⁸ and finally abandon the concept of container-managed persistence. This specification is called Java Data Objects. Even though JDO is quite powerful, in some aspects even more powerful than its successor, JPA⁹, it failed to gain wider popularity among developers and become mainstream. JDO requires byte-code manipulation to enhance persistent classes and, therefore, is quite complicated as well. Mike Keith ([7]) also claims that one of the reasons why JDO failed is its inherently object-oriented query language, which does not play well with programmers used to relational databases.
1.3.4 myBatis
myBatis (formerly iBatis) is a lightweight persistence framework that gives the programmer full control over the SQL queries sent to the database. It does not generate any SQL code; it merely maps custom SQL statements to the properties of entities being stored in the database. Despite all its advantages and interesting ideas, it is not a full-blown persistence solution, since it lacks features demanded from a general-purpose persistence framework, such as portability across different database systems (all the SQL code is database-specific). Moreover, myBatis is not part of the Java EE specification, which also means that it does not integrate with the rest of the Java EE ecosystem and, therefore, features like container-managed transaction handling, entity lifecycle callbacks or JSR-303 Bean Validation are not supported.
8. Abbreviation of Plain Old Java Object, denoting ordinary Java classes which do not follow any special conventions or framework rules (http://www.martinfowler.com/bliki/POJO.html).
9. JDO, for example, supports non-relational data stores, whereas JPA does not.
1.4 JPA
The EJB 3 specification, as part of the completely reworked Java EE 5 released in 2006, contained a new specification regarding persistence: the Java Persistence API¹⁰. JPA was a response to users' increasing frustration with the complexity of EJB 2.x entity beans. Authors of proprietary persistence frameworks and other experts were invited to sit in the groups working on a brand new Java persistence specification, which would replace EJB entity beans.
JPA2 (included in Java EE 6, released in 2009) is an evolution of JPA. It is based on the experience with JPA and reflects users' criticism (mostly about missing features which were already present in other proprietary persistence frameworks). In this text I will focus only on the JPA2 specification and its features¹¹.
1.5 Goals of the thesis
• Compare three different JPA implementations and build a feature matrix showing their strengths and weaknesses.
• Take an open-source project written in each JPA implementation, migrate it to the other two, test it on Oracle, PostgreSQL and MySQL, and document the issues I run into during the migration process.
• Build a migration tool which supports migrating OpenJPA and EclipseLink projects to Hibernate.
10. In Java EE 5, the JPA specification is formally a part of the EJB 3 specification. The decision to bind them together was probably quite unfortunate, though, because JPA is in no way dependent on an EJB container and thus works perfectly fine in Java SE environments. JPA2 is already a separate specification, formally independent of EJB.
11. From this point on, wherever I use JPA, I mean the JPA2 specification. I will use the term JPA2 only to emphasise that a particular feature was introduced in the new JPA2 standard.
2.1 Methodology of the comparison
Comparing JPA providers is a tricky task. Since every JPA implementation has to obey the standard and implement everything that the standard defines, one might incorrectly conclude that they are all the same. This is true up to a point; within the boundaries defined by the standard they are all equal. However, each JPA implementation provides features that go beyond the scope of the JPA standard. These vendor-specific features extend the frameworks' functionality in various areas.
I have divided the features that go beyond the scope of the JPA standard into several categories. I will go through all the categories and describe which features a particular implementation offers and which alternatives the others have. In the end, I will summarise the results of the comparison in a short feature matrix.
I will mainly focus on features which are directly related to JPA and extend it in some way. Therefore, I will not discuss, for example, …
1. http://www.eclipse.org/eclipselink/downloads/ri.php
2. EclipseLink is used as the default JPA provider in GlassFish, OpenJPA in Geronimo and Hibernate in JBoss AS.
2.2 Identifier generation
2.3 Performance
• 4 GB RAM
• Oracle JDK 7
• PostgreSQL 9.1.3
To guarantee that the tests of all frameworks run under the same conditions⁷, I disabled automatic schema generation and created the schema manually, as shown in listing 2.1.
    create sequence sample_sequence start 1000 increment 50;

    create table "user" (
        user_id bigint primary key,
        username varchar(32) not null unique,
        enabled boolean not null default true);

    create table users_authorities (
        user_id bigint references "user",
        name varchar(32) not null);
6. Some tests need data already present in the database. The required INSERT statements are, therefore, executed in the initialization phase, the duration of which is not included in the test result.
7. As seen in appendix A.2, OpenJPA generates some additional indexes, which would need to be updated with each insert and, therefore, test execution times would be affected.
Run 2      Run 3      Average
863 ms     756 ms     815.66 ms
5155 ms    5227 ms    5222.33 ms
683 ms     666 ms     680 ms

Run 2      Run 3      Average
1138 ms    1013 ms    1051 ms
1086 ms    1078 ms    1120 ms
1133 ms    1256 ms    1166.33 ms

Run 2      Run 3      Average
1530 ms    1523 ms    1540 ms
1094 ms    1136 ms    1094.66 ms
764 ms     862 ms     836 ms
Table 2.5: Fetch all users using criteria API test results
An interesting observation from these results is that Hibernate performs significantly slower when using the criteria API instead of JPA QL, whereas EclipseLink performs slightly faster. The results of OpenJPA are about the same.
              Run 1     Run 2     Run 3     Average
Hibernate     127 ms    115 ms    100 ms    114 ms
OpenJPA       176 ms    178 ms    177 ms    177 ms
EclipseLink   104 ms    107 ms    104 ms    105 ms
Table 2.7: Complex join using criteria API test results
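The Average column is the arithmetic mean of the three runs. A minimal sketch of such a measurement loop could look as follows; the class and method names are hypothetical, and the benchmark harness actually used for the tests may differ.

```java
import java.util.concurrent.TimeUnit;

final class Benchmark {
    // Arithmetic mean of the measured durations in milliseconds.
    static double average(long... millis) {
        long sum = 0;
        for (long m : millis) {
            sum += m;
        }
        return (double) sum / millis.length;
    }

    // Runs the task the given number of times and returns each duration in milliseconds.
    static long[] measure(Runnable task, int runs) {
        long[] durations = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            durations[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return durations;
    }
}
```

For the Hibernate row of Table 2.7, Benchmark.average(127, 115, 100) evaluates to 114.0, which agrees with the reported average.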
2.4 Type conversion
The JPA specification does not define any kind of type conversion. For example, if the database contains a string field which stores boolean values as "Y" and "N" strings, there is no way to map it (directly) to a Java boolean. All JPA implementations, however, provide extensions which allow mapping various database types to Java types and also allow creating user-defined types.
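The conversion logic itself is small; what differs between the providers is the extension point it has to be plugged into. The following sketch shows only the conversion for the "Y"/"N" case. The class YesNoConverter and its method names are hypothetical and do not implement any provider's interface.

```java
// Hypothetical converter between a Java boolean and a "Y"/"N" database column.
final class YesNoConverter {
    // Java value -> database value
    static String toDatabase(boolean value) {
        return value ? "Y" : "N";
    }

    // Database value -> Java value; anything other than "Y" or "N" is rejected.
    static boolean toJava(String column) {
        if ("Y".equals(column)) {
            return true;
        }
        if ("N".equals(column)) {
            return false;
        }
        throw new IllegalArgumentException("Expected Y or N but got: " + column);
    }
}
```

Rejecting unexpected values instead of silently mapping them to false is a design choice of this sketch; a real mapping may prefer to treat NULL or other values differently.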
In Hibernate, there is a Type interface. All the types Hibernate
recognizes implement this interface. So, in Hibernate there are classes
like CalendarType mapping Calendar to a datetime, ClassType,
which maps Java Class objects to varchars, etc. However, for creating new custom types, it is generally not recommended to implement
Type directly because it would make custom type converters tightly
coupled with the Type interface and future changes (such as added
or removed methods) would break all custom type converters [12].
For this reason, there is an interface UserType that should be used
for creating custom type converters, which are later adapted to Type
using CustomType.
OpenJPA also provides support for creating custom mappings. There is an interface ClassStrategy which can be used for creating a mapping between custom classes and the database schema. Such a class strategy can then be configured using the @Strategy mapping annotation. For creating various custom field mappings, OpenJPA provides the ValueHandler and FieldStrategy interfaces. The latter is a bit more complicated to implement, but provides more flexibility
2.5 Caching support
• ALL
• NONE
• ENABLE_SELECTIVE
• DISABLE_SELECTIVE
• UNSPECIFIED
There is also the @Cacheable annotation, used for specifying the entity-level caching mode when either ENABLE_SELECTIVE or DISABLE_SELECTIVE global caching has been set. With the UNSPECIFIED cache mode setting, provider-specific rules apply. Since caching is handled in such a minimalistic and rather abstract way, various vendor-specific extensions exist.
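How the global mode and @Cacheable combine can be summarised in a few lines. The sketch below is a plain Java model of these rules as I read them from the JPA 2.0 specification; the enum CacheMode and the class CacheDecision are hypothetical stand-ins, not the JPA API itself.

```java
// Hypothetical stand-in for the global shared cache mode setting.
enum CacheMode { ALL, NONE, ENABLE_SELECTIVE, DISABLE_SELECTIVE, UNSPECIFIED }

final class CacheDecision {
    /**
     * @param cacheable value of @Cacheable on the entity, or null if the annotation is absent
     * @return TRUE or FALSE when the mode decides, null when provider defaults apply
     */
    static Boolean isCached(CacheMode mode, Boolean cacheable) {
        switch (mode) {
            case ALL:
                return Boolean.TRUE;                       // annotation is ignored
            case NONE:
                return Boolean.FALSE;                      // annotation is ignored
            case ENABLE_SELECTIVE:
                return Boolean.TRUE.equals(cacheable);     // only @Cacheable(true) entities
            case DISABLE_SELECTIVE:
                return !Boolean.FALSE.equals(cacheable);   // all except @Cacheable(false)
            default:
                return null;                               // UNSPECIFIED: provider-specific
        }
    }
}
```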
The approach recommended in the Hibernate documentation is to delegate the caching functionality to specialized caching tools¹². Thus, Hibernate neatly integrates with the most popular caching frameworks like EhCache or Hazelcast, simply by setting the configuration property hibernate.cache.region.factory_class to an appropriate cache region factory of the selected caching framework.
OpenJPA comes with its own data cache implementation¹³, which can be turned on using the openjpa.DataCache configuration property. OpenJPA also contains a transaction-events notification framework¹⁴, which can be used for cache synchronization between nodes in a distributed environment.
EclipseLink also contains an integrated second-level cache implementation and does not rely on any third-party framework. Like OpenJPA, it also supports caching in a clustered environment; using the configuration property eclipselink.cache.coordination.protocol it is possible to specify which protocol should be used for cache coordination between nodes. The possible options described in the documentation [13] are RMI and JMS.
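The properties named above could be collected per provider as in the following sketch, for example to pass them to Persistence.createEntityManagerFactory(String, Map) or to copy them into persistence.xml. The class CacheSettings is hypothetical, and the concrete values (the EhCache region factory class name, "true" and "rmi") are my assumptions about typical settings that should be checked against the documentation of the version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CacheSettings {
    // Returns provider-specific caching properties; the values are assumed, typical settings.
    static Map<String, String> forProvider(String provider) {
        Map<String, String> props = new LinkedHashMap<>();
        if ("hibernate".equals(provider)) {
            props.put("hibernate.cache.use_second_level_cache", "true");
            // Assumed region factory class shipped with the Hibernate EhCache integration.
            props.put("hibernate.cache.region.factory_class",
                    "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        } else if ("openjpa".equals(provider)) {
            props.put("openjpa.DataCache", "true");
        } else if ("eclipselink".equals(provider)) {
            // The cache itself is built in; this only selects the cluster coordination protocol.
            props.put("eclipselink.cache.coordination.protocol", "rmi");
        } else {
            throw new IllegalArgumentException("Unknown provider: " + provider);
        }
        return props;
    }
}
```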
12. Hibernate developer guide, section 21.2
13. There is a plug-in integrating EhCache with OpenJPA, but since it is not even mentioned in the official documentation, I will not discuss it further.
14. OpenJPA user's guide, section 12.2
2.6 Entity lifecycle and transactional events
• pre/post-transaction commit
• pre/post-transaction rollback
• pre/post-query execution
• ...
2.7 Schema generation
In order to keep the specification clean and simple¹⁷, JPA does not force vendors to generate a database schema¹⁸. However, the authors of the specification took schema generation into account and integrated various metadata mappings which can be used during schema generation, such as the nullable or unique annotation properties.
All the compared implementations provide schema generation functionality. Generated statements can either be sent directly to the database or saved into a file for manual execution. DROP statements can also be generated, so during development it is possible to have all tables removed and generated again on each startup. This ensures that the application always starts in the same state with an empty database. However, even though automatic schema generation might be convenient in the development phase, it is often a
17. Another reason why the specification does not contain any details of schema generation is that it is a very vendor-specific issue. The specification would have to describe what the tables generated from entities should look like on a particular database platform, which data types or constraints should be used, etc. Since all these elements differ very much from database to database, the specification avoids the topic entirely.
18. Java™ Persistence API, Version 2.0, page 355
19. http://www.liquibase.org/
20. http://code.google.com/p/flyway/
21. http://code.google.com/p/mybatis/
22. OpenJPA user's guide, section 7.2
23. http://www.sparxsystems.com.au/
2.8 Support for stored procedures
So, having defined the stored procedure from listing 2.2, the named stored procedure call definition would look as follows:

    @NamedStoredProcedureQuery(
        name = "getItemCount",
        procedureName = "calculate_item_count",
        parameters = @StoredProcedureParameter(
            queryParameter = "result",
            name = "result",
            direction = Direction.OUT))
2.9 Integrating with other frameworks
Another important decision factor for choosing a JPA implementation is its support for other frameworks which would extend its capabilities even further.
25. It is mentioned very briefly in the documentation, without any examples or further details.
2.10 Licenses
When a company is considering implementing their enterprise solution on top of some ORM framework, the character of its license
is very important. Fortunately, all of the compared frameworks are
released under permissive and business-friendly licenses. Hibernate
26. Java™ Persistence API, Version 2.0, section 3.6
27. Hibernate Validator reference documentation, section 2.4.1
28. http://www.hibernate.org/license
29. http://openjpa.apache.org/license.html
30. http://wiki.eclipse.org/EclipseLink/FAQ/General
31. http://forums.hibernate.org
32. http://www.eclipse.org/forums/index.php/f/111
2.13 Summary
In this section I discussed various vendor-specific extensions of the JPA standard. I summarize this chapter in the following brief feature matrix:
Custom identifier generation
Performance in the default configuration
Built-in event notification framework
Support for custom types
Schema generation
Caching
Nonstandard stored procedures support
Third-party frameworks support
Documentation quality
Business-friendly license
Hibernate
Yes
OpenJPA EclipseLink
Yes
Yes
2.
3.
1.
Powerful
Yes
Yes
Yes
Yes
Third-party
No
Yes
Powerful
Custom
No
Yes
Yes
Custom
Yes
Yes
No
No
High
Yes
High
Yes
Moderate
Yes
3.1 Migrating from Hibernate
Since Hibernate does not ship with any official reference application, I searched SourceForge² for projects using Hibernate which would be suitable candidates for experimental migration. In the end, I decided on open-forum³ because it is already managed by Maven and has a convenient and easily understandable project structure. Open-forum describes itself as an open-source forum engine written in Java. The project is not yet finished and quite a lot of functionality is still not implemented, but as an example project for experimental migration it is fine.
Open-forum uses the Spring framework, JSF2 and Hibernate. An interesting fact is that it does not use JPA at all, but rather relies on the native Hibernate API. Therefore, it will be necessary to migrate the complete persistence logic to the standard API. The project also depends
1. I might have chosen the AUTO generation strategy, which automatically determines the right one for the particular database, but since I prefer to have control over what is going to be generated, I configured the generation strategy manually.
2. http://www.sourceforge.net/
3. http://sourceforge.net/projects/open-forum/
• Rewrite HQL queries, because their syntax differs in some aspects from JPA QL.
The migration of this project went without any serious complications. It was not even necessary to use any vendor-specific extensions (with the exception of a different configuration property for schema generation).
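The schema generation property is thus one of the few settings that has to change with the provider. The sketch below lists one commonly used property per provider; the enum JpaProvider is hypothetical, the values are typical development-time settings chosen by me, and their exact semantics are not equivalent across providers.

```java
// Hypothetical lookup of the provider-specific schema generation property.
enum JpaProvider {
    HIBERNATE("hibernate.hbm2ddl.auto", "create-drop"),
    OPENJPA("openjpa.jdbc.SynchronizeMappings", "buildSchema(ForeignKeys=true)"),
    ECLIPSELINK("eclipselink.ddl-generation", "drop-and-create-tables");

    final String property; // property name understood by the provider
    final String value;    // assumed development-time value

    JpaProvider(String property, String value) {
        this.property = property;
        this.value = value;
    }

    // Renders the setting as a key=value line, e.g. for a properties file.
    String asLine() {
        return property + "=" + value;
    }
}
```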
3.2
http://openjpa.apache.org/samples.html
Java™ Persistence API, Version 2.0, section 6.4
http://www.hibernate.org/subprojects/jpamodelgen.html
3.3
• The application was saving the gender field of employees as "M" and "F" for males and females, respectively. It was using the EclipseLink-specific object-type converter to convert such strings stored in the database into instances of Gender on the Java side. Since such a conversion is not possible in pure JPA, I had to use vendor-specific extensions of Hibernate and OpenJPA.
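The conversion that each provider-specific extension has to perform is the same in all three cases. A minimal sketch, assuming a Gender enum with the two values MALE and FEMALE (the constant and method names are hypothetical, only the type name Gender comes from the migrated application):

```java
// Hypothetical Gender enum carrying the single-letter code stored in the database.
enum Gender {
    MALE("M"),
    FEMALE("F");

    private final String code;

    Gender(String code) {
        this.code = code;
    }

    // Java value -> database value
    String toDatabase() {
        return code;
    }

    // Database value -> Java value; unknown codes are rejected.
    static Gender fromDatabase(String code) {
        for (Gender g : values()) {
            if (g.code.equals(code)) {
                return g;
            }
        }
        throw new IllegalArgumentException("Unknown gender code: " + code);
    }
}
```

Keeping the code inside the enum means the provider-specific converter only has to delegate to these two methods.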
3.4
• For OpenJPA, I enabled unenhanced persistent classes at runtime using the configuration property openjpa.RuntimeUnenhancedClasses. Even though the classes should be enhanced automatically during deployment on the JBoss application server, I was receiving an error that they were not. Although I could have run compile-time enhancement using Ant or Maven (as I did in the performance benchmark), I decided to simply allow the classes to be unenhanced in this case.
Migration summary
4.1
an output directory
2. The arguments are validated (e.g. whether the project directory exists, whether the output directory is writable, whether additional access rights are granted, etc.).
4. The list of available processors for the particular JPA implementation is loaded from the application configuration.
7. Instances of Report created during the migration of individual source files are collected and passed as an argument to the templating engine Scalate, which is responsible for creating an HTML report using the predefined template.
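The validation step from the list above can be sketched as follows. This is a Java illustration only (the tool itself is written in Scala); the class ArgumentValidator and the exact set of checks are assumptions, not the tool's actual code.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ArgumentValidator {
    // Returns a list of problems; an empty list means the arguments are usable.
    static List<String> validate(Path projectDir, Path outputDir) {
        List<String> problems = new ArrayList<>();
        if (!Files.isDirectory(projectDir)) {
            problems.add("Project directory does not exist: " + projectDir);
        } else if (!Files.isReadable(projectDir)) {
            problems.add("Project directory is not readable: " + projectDir);
        }
        if (!Files.isDirectory(outputDir)) {
            problems.add("Output directory does not exist: " + outputDir);
        } else if (!Files.isWritable(outputDir)) {
            problems.add("Output directory is not writable: " + outputDir);
        }
        return problems;
    }
}
```

Collecting all problems instead of failing on the first one lets the user fix every invalid argument in a single pass.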
4.2
While the parsing of XML configuration files is rather straightforward and only involves quite simple DOM tree manipulation, parsing Java source files and analysing them is much more complicated. It is necessary to obtain an abstract representation of a Java source file: an abstract syntax tree (AST). Such an AST is then recursively searched for occurrences of vendor-specific constructs, such as types, annotations or imports. For creating AST representations of Java source files, I use the library javaparser⁶.
In listing 4.1, I demonstrate how Scala pattern matching [18] is used to recursively browse an AST and search for a specific annotation. The method receives three arguments: an instance of CompilationUnit, which is an abstract representation of a single source file; an annotation to search for; and a closure to call for all annotations found. The method recursively searches all types (classes or interfaces) and their members (fields, constructors, methods and inner classes) and checks whether they are declared with the specific annotation. If such an annotation is found, the closure is called, passing the found annotation as an argument. Such a solution is very flexible: the client just passes in an arbitrary closure, which is guaranteed to be called for all matching annotations found. It is the responsibility of the caller⁷ to do something useful with it.
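The same idea can be sketched in Java without javaparser. The Node class below is a deliberately simplified, hypothetical stand-in for a real AST (annotations are plain strings, and the callback receives the annotated declaration rather than an annotation object), so it illustrates only the recursion and the callback, not the tool's actual Scala code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Toy AST node: a declaration with its annotations and nested members.
final class Node {
    final String name;
    final List<String> annotations;
    final List<Node> members;

    Node(String name, List<String> annotations, Node... members) {
        this.name = name;
        this.annotations = annotations;
        this.members = Arrays.asList(members);
    }
}

final class AnnotationSearch {
    // Calls the callback for every declaration carrying the annotation, depth-first.
    static void find(Node node, String annotation, Consumer<Node> callback) {
        if (node.annotations.contains(annotation)) {
            callback.accept(node);
        }
        for (Node member : node.members) {
            find(member, annotation, callback);
        }
    }
}
```

A caller could, for example, collect the names of all annotated declarations with AnnotationSearch.find(root, "GenericGenerator", n -> names.add(n.name)).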
Since the very same approach is used to search for occurrences of vendor-specific types in an AST, pattern matching and recursive functions are of great use here again. I browse the AST from the
6. http://code.google.com/p/javaparser/
7. The callers in the current architecture are the processors.
Listing 4.1: Recursively searching the abstract syntax tree for vendor
8. It means that the parsing starts with the whole source file, then recurses into all the types it contains, then into all their fields, methods and constructors, which contain various language constructs such as ifs, loops, anonymous classes, returns, etc., down to atomic statements like string literals or null expressions.
4.3
The application is written in a clean functional style, with emphasis on extensibility and code clarity. Migrating additional constructs is just a matter of writing a processor (which is very simple
if one of the abstract classes providing support for manipulating the
DOM or browsing an AST is extended) and plugging it into the application by configuring the class name of the processor in
the config.properties configuration file. The current implementation
supports migrating almost all the features described in
chapter 2. However, as I noted earlier, both JPA implementations
contain additional features which were not discussed and the parsing of which is currently not implemented. Future development might therefore focus on parsing and migrating these additional vendor-specific features.
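The plug-in mechanism described above might look roughly like the following sketch. It rests on several assumptions that are not taken from the thesis: the Processor trait and its process method, the example TrimProcessor, and the property key "processors" are all hypothetical. Only the idea of listing processor class names in a properties file comes from the text.

```scala
import java.io.StringReader
import java.util.Properties

object ProcessorLoading {
  // Hypothetical processor contract; the real one is not shown in the text.
  trait Processor { def process(source: String): String }

  // Made-up example processor used only to exercise the loader.
  class TrimProcessor extends Processor {
    def process(source: String): String = source.trim
  }

  // Reads a comma-separated list of class names from the assumed key
  // "processors" and instantiates each class via its no-argument constructor.
  def loadProcessors(config: Properties): List[Processor] =
    Option(config.getProperty("processors")).toList
      .flatMap(_.split(",").toList)
      .map(_.trim)
      .filter(_.nonEmpty)
      .map { className =>
        Class.forName(className)
          .getDeclaredConstructor()
          .newInstance()
          .asInstanceOf[Processor]
      }

  // Usage: in a real run the properties would be loaded from config.properties;
  // here they are read from an in-memory string to keep the sketch self-contained.
  def demo(): List[Processor] = {
    val config = new Properties()
    config.load(new StringReader(s"processors=${classOf[TrimProcessor].getName}"))
    loadProcessors(config)
  }
}
```

With this shape, adding a processor requires no change to the loading code, only a new class on the classpath and one more entry in the configuration.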
An additional problem is that the parser used for creating the AST supports only Java up to version 5. Therefore, parsing Java source files
which contain language constructs introduced after Java 5
(such as the diamond operator or the try-with-resources statement from Java 7) leads to
parsing errors. Unfortunately, this is not easy to fix, because it would
require replacing the parser and very likely also rewriting most of the processing logic.
5 Conclusion
In the theoretical part of the thesis I compared the most popular JPA implementations according to various criteria I selected. The
expected output of any comparison is a ranking indicating
who the winners and the losers are. Unfortunately, creating such a ranking is not possible in the case of JPA implementations. All implementations have their strengths and weaknesses, and the choice depends heavily
on the needs of the implementer. Hibernate would be the choice for
its large community, high-quality documentation, integration with
many other frameworks and the support provided by JBoss. EclipseLink, however, is the reference implementation of the JPA standard,
and its biggest advantage is its great support for stored procedures.
Finally, OpenJPA contains the most powerful features in the area
of schema generation and generation of persistence classes from
an existing database schema. However, as I discovered in chapter 3, if
a project heavily utilizes native features of any JPA implementation, it is necessary to remember that migrating from
one provider to another will be difficult and will require significant
changes in the code base.
The practical part involved creating an application providing support
for migrating a project from one JPA provider to another. The application is developed in the Scala programming language and is called
JPA migration. It goes through all the sources contained in the source
directory, searches for vendor-specific features and tries to migrate
them automatically, either to the standard API or to Hibernate if there
is no standard alternative. If it is not possible to do this automatically,
it displays a hint on how the particular vendor-specific feature could be
migrated manually, as well as a link to the corresponding chapter in
the documentation.
A.2 OpenJPA
CREATE SEQUENCE sample_sequence START WITH 1 CACHE 50;
CREATE TABLE "user" (
    user_id BIGINT NOT NULL,
    enabled BOOL,
    username VARCHAR(255) NOT NULL,
    PRIMARY KEY (user_id),
    CONSTRAINT U_USER_USERNAME UNIQUE (username)
);
A.3 EclipseLink
Bibliography
[1] R. Ramakrishnan and J. Gehrke, Database management systems. McGraw-Hill international editions: Computer science
series, McGraw-Hill, 2003.
[2] A. Silberschatz, H. Korth, and S. Sudarshan, Database System
Concepts. McGraw-Hill, 2010.
[3] R. Stephens, Beginning Database Design Solutions. John Wiley
& Sons, 2010.
[4] N. H. Bercich, The evolution of the computerized database,
ARXIV, 2003.
[5] E. F. Codd, A relational model of data for large shared data
banks, Commun. ACM, vol. 13, pp. 377387, June 1970.
[6] C. Bauer and G. King, Java persistence with Hibernate. Manning Pubs Co Series, Manning, 2007.
[7] M. Keith and M. Schincariol, Pro JPA 2: mastering the Java Persistence API. Apress Series, Apress, 2009.
[8] JSR 317: Java Persistence API, Version 2.0. http://jcp.
org/en/jsr/detail?id=317, Dec. 2009.
[9] Hibernate developer guide. http://docs.jboss.org/
hibernate/core/4.0/devguide/en-US/html/.
[10] OpenJPA users guide. http://openjpa.apache.org/
builds/2.1.1/apache-openjpa/docs/manual.html.
[11] EclipseLink 2.3 API documentation. http://www.
eclipse.org/eclipselink/api/2.3/index.html.
[12] Hibernate ORM 4.0 API documentation. http://docs.
jboss.org/hibernate/orm/4.0/javadocs/.
[13] EclipseLink project wiki. http://wiki.eclipse.org/
Category:EclipseLink/Documentation/JPA.