Monday, October 16, 2006
Fun with JPA
I decided to learn from some of my past mistakes and separate out the model code from the forms code from the business logic. As it stands today, some of the model (which loosely follows the DAO pattern) contains code for form validation... and some of it doesn't. There's code in the JSPs to handle deciding whether a form is needed, whether validation was successful, and whether inserting/updating/deleting happened correctly, as well as dumping the errors to the page should anything go wrong. Basically, the thing's a mess, and I've learned a lot since those days.
For several years at my day job, I've been working with Toplink. Prior to that, I did some work with Hibernate and found it frustrating and not ready for serious work (though in fairness to Hibernate, this was with early version 2 stuff). It required patching just to make it barely functional in our environment. Toplink, on the other hand, is a very mature product, having its roots in Smalltalk in the early 1990s, and its mapping workbench is many orders of magnitude easier to work with than Hibernate's XML configuration files. (I despise XML configuration files passionately. They are hideous to look upon and even harder to read and impossible to work with. Or at least impossible for me. I have no tolerance for that garbage at my age.)
As an aside, the Hibernate documentation once answered the FAQ, "Why doesn't Hibernate have a mapping GUI?" with this:
Because you don't need one. The Hibernate mapping format was designed to be edited by hand; Hibernate mappings are both extremely terse and readable. Sometimes, particularly for casual / new users, GUI tools can be nice. But most of the time they slow you down. Arguably, the need for a special tool to edit mapping documents is a sign of a badly-designed mapping format. vi should be a perfectly sufficient tool.
And if you believe that, I have a nice bridge for sale that I'd like to tell you about. Incidentally, I like vi as much as the next guy, but the statement "vi should be a perfectly sufficient tool" calls into question the very sanity of whoever wrote that answer. Twiddling with XML configuration files by hand is not a reasonable or sane way to manage an object model. Maybe there are some people who get a little ego stroking and feel terribly studly and elite working that way, but I am waaaaay too old for that nonsense.
But I digress.
The main thing holding me back from using Toplink in a restructuring of the morons.org back-end was its rather expensive commercial license. Luckily, EJB3 Persistence, now better known as JPA (the Java Persistence API), achieved its first release this year, and the reference implementation was done by Oracle (which bought Toplink a few years back) and is largely Toplink code. Around the same time, FreeBSD got an official binary Java 1.5 release, which made annotations available. Annotations aren't required for working with JPA (you can use... a big honking XML configuration file!), but they're by far the easier and cleaner way.
So it was easy to decide to use JPA to manage the object model. I also chose Wicket to handle the view and controller components of the system. Wicket is surprisingly non-stupid, which is a heck of a lot more than I can say for frameworks like Struts. The trouble with a lot of these MVC frameworks is that for the sake of saving you perhaps two if statements, they introduce 30 lines of convoluted XML configuration with a similarly complex API to go with it. I've always believed that frameworks should save you time. If a framework's design is such that complexity is increased-- that is, you replace a few lines of flow control logic written in Java with a large API with interfaces you must implement and XML configuration you must write-- ostensibly for cleanliness that in the end is not realized, then you haven't got a good framework... you've got an exercise in neurotic time-wastery that would turn off even most obsessive-compulsive autistic people. Somehow that doesn't stop some folks.
But I digress.
I also decided to use Sitemesh to handle the overall site layout, abandoning my old system of a customized XSLT filter. Realistically, there are two main categories of browser that access my site anymore: Microsoft Internet Explorer 6 and Firefox 1.5. There is a pittance of other browsers like Safari, Kmeleon, a couple of Opera users and the occasional text browser or HTML fetcher. I get a rare hit from an old version of Netscape. The bottom line is that I no longer see value in designing customized XSLT stylesheets for ancient browsers that rarely hit my site. Also the XSLT is (relatively) slow. Sitemesh offers a way of using different files for merging for IE versus Firefox. Currently Firefox is the only browser I know of that can handle exclusively using DIV for layout and that handles display: block correctly. So I will take advantage of Sitemesh to give table-based layout to IE and DIV-based layout to Firefox. The latter saves a bit of bandwidth and is really the better way to do things.
For my code structure I decided I'd put my model in one package, my form handlers in another package, and a service layer in a third package. One discussion that happens with every new project is "where does the business logic go?" and "what operations will exist in the object model?" The trouble with putting intelligence in your object model is that these things tend to be auto-generated by mapping tools and you have to figure out some way to get this business code re-injected each time you generate the model. Further compounding the issue, many business operations involve complex relationships between multiple classes from the object model; how do you decide which part of the model should include these operations? Moreover, do operations of that nature even belong in the object model?
I elected to find the most reasonable compromise I could, which was to put these business operations in a service layer, which acts as an intermediary between things like GUI forms and the data model. In this way, business operations are kept out of the GUI, and things which wish to interact with the model are decoupled from directly operating on it.
(Mind you, this is not a service layer implemented as a web service, though it would be easy to tack on a web service using the service layer classes. Web services are nice for some things, but they would add an unnecessary level of indirection, complexity and latency.)
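To make the layering concrete, here is a minimal sketch in plain Java with no JPA or Wicket involved. Everything in it is hypothetical: the class names SketchAccount, SketchAccountStore and AccountService, and the "non-blank and unique username" rule, are made up for illustration and are not the actual morons.org code. The store is an in-memory stand-in for whatever the persistence layer ends up being.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Model: a plain data holder with no business rules (hypothetical fields).
class SketchAccount {
    private final long objectId;
    private final String username;

    SketchAccount(long objectId, String username) {
        this.objectId = objectId;
        this.username = username;
    }

    long getObjectId() { return objectId; }
    String getUsername() { return username; }
}

// In-memory stand-in for the persistence mechanism (an assumption for this sketch).
class SketchAccountStore {
    private final Map<String, SketchAccount> byUsername = new HashMap<>();
    private long nextId = 1;

    Optional<SketchAccount> findByUsername(String username) {
        return Optional.ofNullable(byUsername.get(username));
    }

    SketchAccount insert(String username) {
        SketchAccount account = new SketchAccount(nextId++, username);
        byUsername.put(username, account);
        return account;
    }
}

// Service layer: the only place that knows the business rules.
// Form handlers would call this instead of touching the model directly.
class AccountService {
    private final SketchAccountStore store;

    AccountService(SketchAccountStore store) {
        this.store = store;
    }

    // Hypothetical rule: usernames must be non-blank and unique.
    SketchAccount register(String username) {
        if (username == null || username.trim().isEmpty()) {
            throw new IllegalArgumentException("username is required");
        }
        if (store.findByUsername(username).isPresent()) {
            throw new IllegalArgumentException("username already taken");
        }
        return store.insert(username);
    }
}
```

A form handler would only ever see AccountService.register, so the rule can change without touching either the GUI code or the (possibly regenerated) model classes.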
Having made all of these decisions, it was time to begin. I decided to start with the model, since that was the area where the most things could go wrong. And go wrong they did! The trouble with using a bleeding-edge technology like JPA is that sometimes things are bleeding because they've been cut by a sharp object, which will cut you if you aren't careful.
I decided to try the Dali plug-in for Eclipse. Dali helps you create object-relational mappings by marking up your code with annotations and providing a simple interface for supplying the right annotations and the right data to go in those annotations. Unfortunately, Dali is at version 0.5 and doesn't always play as nicely as it could with Eclipse and persistence. Dali also lacks support for PostgreSQL, a glaring oversight considering the popularity of that open-source database (and the one I use). No matter; it does support generic JDBC. I was able to create entities based on classes in my database, with the caveat that some types did not translate well to native Java types (they probably would have if Dali supported PostgreSQL). For example, a few boolean database types became Java longs. These I edited by hand.
I had to struggle a little with the mapping between an Article and a MessageBoardRoot. In the previous incarnation of the morons.org backend code, entities become associated with their corresponding message boards through a table called "roots" which records the entity's type and primary key. For example, a board with the root_id 23654 might be associated with an "article" with the primary key "6745". The composite of "article" and "6745" is unique and associated with all of the messages bearing the root_id of "23654." I wanted to map this as a 1:1 relationship so I could ask an Article for its MessageBoardRoot. Unfortunately, JPA does not allow you to create a join based on a primary key and a constant; only persisted fields can participate in the join. So it was either include in every single article a field with the word "article" in it, or make every object on the entire system have a unique ID and drop the "root_type" field from the roots table. I decided on the latter; every object on the system would share a single sequence generator and name the primary key the same thing: object_id. This meant I would have to renumber some old data, but I think the change is worth it. I can also still record the creating class name in the MessageBoardRoot objects for the sake of being able to easily locate the corresponding class for administrative purposes later.
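The idea behind the shared sequence can be sketched in plain Java without any JPA at all. This is only an illustration under assumptions: an AtomicLong stands in for the single database sequence, a HashMap stands in for the roots table, and the class names (including the second entity type, SketchPoll) are hypothetical. The point is that once every object_id is unique across all types, the owner's ID alone identifies its message board root and no type column is needed for the join.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for the one database sequence shared by every table.
class ObjectIdSequence {
    private static final AtomicLong NEXT = new AtomicLong(1);

    static long next() {
        return NEXT.getAndIncrement();
    }
}

// Two unrelated entity types drawing IDs from the same sequence.
class SketchArticle {
    final long objectId = ObjectIdSequence.next();
}

class SketchPoll { // hypothetical second entity type
    final long objectId = ObjectIdSequence.next();
}

// A root records its owner's object_id; the class name is kept only
// as administrative information, not as part of the lookup key.
class SketchMessageBoardRoot {
    final long objectId = ObjectIdSequence.next();
    final long ownerObjectId;
    final String ownerClassName;

    SketchMessageBoardRoot(long ownerObjectId, String ownerClassName) {
        this.ownerObjectId = ownerObjectId;
        this.ownerClassName = ownerClassName;
    }
}

// In-memory stand-in for the roots table, keyed by owner object_id alone.
class SketchRoots {
    private final Map<Long, SketchMessageBoardRoot> byOwner = new HashMap<>();

    SketchMessageBoardRoot createFor(long ownerObjectId, Class<?> ownerClass) {
        SketchMessageBoardRoot root =
                new SketchMessageBoardRoot(ownerObjectId, ownerClass.getName());
        byOwner.put(ownerObjectId, root);
        return root;
    }

    // Returns null when the owner has no message board root.
    SketchMessageBoardRoot findFor(long ownerObjectId) {
        return byOwner.get(ownerObjectId);
    }
}
```

Because an article and a poll can never share an object_id here, the old composite key of type plus primary key collapses into a single column.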
Then I tried to persist the simplest record I could: an Account, which corresponds to a user login account. JPA exploded immediately with a NullPointerException:
Exception in thread "main" java.lang.NullPointerException
Ah, I just love the intuitive errors provided by Open Sores code sometimes. After failing to find a conclusive answer with Google, I came to realize that this error occurs when JPA is unable to locate the persistence.xml file, which normally resides inside the META-INF directory of a jar file, but may also reside in META-INF in a directory on your classpath. After a lot of digging, and using FileMon from Sysinternals.com, I discovered the problem: JPA prepends META-INF onto your classpath entries, so if you think you're going to add your META-INF directory to your classpath, you're barking up the wrong tree. You actually need one directory above it.
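The behavior is easy to reproduce with an ordinary class loader resource lookup, which is presumably what the provider does under the hood (that part is my assumption; I have not read the Toplink source). The sketch below builds a throwaway directory containing META-INF/persistence.xml and asks whether the file is visible when a given directory is the only classpath entry. The class and method names are made up for the demonstration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PersistenceXmlLookup {

    // Creates <tmp>/META-INF/persistence.xml and returns <tmp>.
    static Path createLayout() {
        try {
            Path root = Files.createTempDirectory("jpa-classpath-demo");
            Path metaInf = Files.createDirectory(root.resolve("META-INF"));
            Files.write(metaInf.resolve("persistence.xml"),
                    "<persistence/>".getBytes(StandardCharsets.UTF_8));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if META-INF/persistence.xml can be found when `entry`
    // is the only directory on the class loader's search path.
    static boolean visibleFrom(Path entry) {
        try {
            URL[] entries = { entry.toUri().toURL() };
            // A null parent keeps the application classpath out of the lookup.
            try (URLClassLoader loader = new URLClassLoader(entries, null)) {
                return loader.getResource("META-INF/persistence.xml") != null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling visibleFrom with the directory returned by createLayout finds the file, while calling it with that directory's META-INF subdirectory does not, because the lookup then goes hunting for META-INF/META-INF/persistence.xml.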
Normally, Dali would have taken care of this for me by putting META-INF in the right place. What happened was that I had added Java Persistence to the Eclipse project before I added a source folder to the project. Eclipse apparently drops the project root as a source folder after you add your first source folder to a project, which meant that my classpath entry for source went from the project root to net.spatula.news/src/java, but the META-INF directory did not move along with that change. I elected to just have Dali create a new persistence.xml file for me by removing the persistence nature from .classpath and adding Java Persistence to the project again. This got me past the NullPointerException-- progress!
The next trouble was with sequences. Whereas Dali does not support PostgreSQL, Toplink Essentials (aka Glassfish persistence) does... sort of. It turns out that sequence generation for primary keys for PostgreSQL is currently broken in Toplink Essentials. If you specify a sequence generator for a primary key field that isn't of type serial, Toplink completely ignores sequence generation and attempts to insert a null for the primary key. The workaround is not to specify a sequence generator and instead set up a table for the default sequence generation strategy. This means issuing the commands
create table sequence(seq_name varchar(40) not null primary key, seq_count integer not null);
insert into sequence values('SEQ_GEN', 1);
I anxiously await Toplink correcting sequence generation for PostgreSQL.
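For anyone wondering what a table like that is for, here is a rough sketch of how a table-backed ID generator can work, written in plain Java with an in-memory map standing in for the sequence table. This is an assumption-laden illustration, not Toplink's implementation: the block-allocation scheme, the allocationSize parameter, and which values in a block get handed out are all choices I made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

class SequenceTableSketch {
    // In-memory stand-in for the "sequence" table: seq_name -> seq_count.
    private final Map<String, Long> rows = new HashMap<>();
    private final int allocationSize; // hypothetical block size
    private long next = 1;            // next id to hand out from the current block
    private long blockEnd = 0;        // last id of the current block; 0 means no block yet

    SequenceTableSketch(int allocationSize) {
        this.allocationSize = allocationSize;
        rows.put("SEQ_GEN", 1L); // mirrors the INSERT statement above
    }

    synchronized long nextId() {
        if (next > blockEnd) {
            // Roughly: UPDATE sequence SET seq_count = seq_count + ?
            //          WHERE seq_name = 'SEQ_GEN'
            long current = rows.get("SEQ_GEN");
            long updated = current + allocationSize;
            rows.put("SEQ_GEN", updated);
            next = current + 1;
            blockEnd = updated;
        }
        return next++;
    }

    // What a SELECT seq_count FROM sequence WHERE seq_name = 'SEQ_GEN' would show.
    synchronized long storedCount() {
        return rows.get("SEQ_GEN");
    }
}
```

The table only has to be touched once per block of IDs, which is the usual reason for allocating in blocks rather than one value at a time.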
My next challenge was a new exception: "cannot persist detached object." In my Account model, an Account has a member called "changer" which is also an Account. The idea is that whoever caused the change in the model will be recorded as the changer. (I use triggers on the database to keep an audit trail for every record change in the account table.) The mapping is fairly straightforward: the changer field maps to the object_id field in the same table. I had created a row in the account table for an account called "System" with an object_id of 0 to bootstrap the table and handle initial inserts. It turned out that my choice of object_id for this special account was the culprit in the exception. Dali mapped the object_id as a Java native type long rather than an object type Long. This means that the only way JPA can tell whether an object is new (has no primary key yet) is whether the primary key is non-zero, since native types default to 0 (except booleans which default to false). JPA thought that I was trying to persist a new "changer" object by reachability because the primary key was set to 0. I haven't checked yet, but I suspect that were the primary key mapped as an object rather than a native type, that determination would have been done based on whether the reference was null, and 0 would then be a legitimate primary key. The solution was to give System an object_id of 1, rather than 0.
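The primitive-versus-wrapper distinction is plain Java and easy to show. In the sketch below, the two looksNew checks are hypothetical stand-ins for whatever Toplink really does internally; they only mirror the behavior described above, and the wrapper case in particular reflects my untested guess rather than anything I have confirmed.

```java
// Key mapped as a primitive: an unset key is indistinguishable from 0.
class PrimitiveKeyAccount {
    long objectId; // defaults to 0
}

// Key mapped as a wrapper: an unset key is null, so 0 is a real value.
class WrapperKeyAccount {
    Long objectId; // defaults to null
}

class NewnessCheck {
    // Hypothetical heuristic for primitive keys: zero means "not yet persisted".
    static boolean looksNew(PrimitiveKeyAccount account) {
        return account.objectId == 0L;
    }

    // Hypothetical heuristic for wrapper keys: null means "not yet persisted".
    static boolean looksNew(WrapperKeyAccount account) {
        return account.objectId == null;
    }
}
```

With the primitive mapping, a System account whose object_id is 0 looks exactly like an object that has never been assigned a key, which matches the exception I ran into.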
I also discovered that JPA has one weird deficiency: there's no way to annotate that an entity is read-only. This is desirable in situations like the audit trail example above; one might want to read these audit records but prevent the developer from accidentally writing back a change or inserting into the audit table directly. One partial workaround is to specify that the primary key field is read-only.
After all of this, I managed to get a row inserted in the database. The next stop: getting it back out again.
"I was able to create entities based on classes in my database, with the caveat that some types did not translate well to native Java types (they probably would have if Dali supported PostgreSQL). For example, a few chars and boolean database types became Java longs. These I edited by hand".
I ran into this blog while googling for a solution to this very problem. I wrote to the guys in dali-dev asking about it and got some nice feedback. The following thread might be of interest: