Sunday, November 11, 2007

What's New in Tally-Ho?

And just where is that 1.0 release, anyway?

Well, I made good on my threat to rip out Toplink Essentials and replace it with OpenJPA. OpenJPA is a bit more pedantic about some things. For example, this code would run fine in Toplink but would throw an IllegalStateException in OpenJPA:

entityManager.getTransaction().begin();
entityManager.close(); // transaction is still active here, which is what OpenJPA objects to


While I was working on dropping in OpenJPA, I decided that I really wanted my tests to pass from within Maven, so I could be sure that run-time enhanced (woven) classes were all going to work nicely. I also wanted to make sure that none of the tests depended on data being in the database that wasn't put there by the SQL scripts that initialize the DB. So I modified my base test class to perform one-time database wiping/initialization prior to running any tests. This exposed a great many flaws in tests that I had lazily written to assume that certain objects were already present.
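The one-time initialization idea can be sketched roughly as below. This is only an illustration in plain Java: the class name OneTimeDatabaseSetup and the Runnable-based shape are assumptions, not the actual Tally-Ho base test class, and the wipe/load step itself (running the SQL scripts) is left to the caller.

```java
/**
 * Hypothetical helper: runs a database wipe-and-load step at most once per JVM.
 * The name and API are assumptions made for illustration.
 */
final class OneTimeDatabaseSetup {
    private static boolean initialized = false;

    private OneTimeDatabaseSetup() {}

    /**
     * Runs wipeAndLoad the first time this is called and skips it afterwards.
     * Returns true if the step ran during this call. If the step throws,
     * the flag stays unset so a later call can try again.
     */
    static synchronized boolean ensureInitialized(Runnable wipeAndLoad) {
        if (initialized) {
            return false;
        }
        wipeAndLoad.run();
        initialized = true;
        return true;
    }
}
```

A base test class would call ensureInitialized from its setup method, passing whatever code wipes the schema and replays the initialization scripts, so every test starts from the same known data no matter which test runs first.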

After fixing all of that, I decided to let Eclipse clean up a lot of other code for me. Eclipse's Source > Clean Up feature is very powerful, allowing you to "final" gobs of things and add the default serialVersionUID to Serializable classes in one giant swoop.

Then I got to work on the next major feature for Tally-Ho: arbitrary HTML pages. For quite a while I agonized over how to manage associations between pages. In most of the world, if a page changes its name or location, links to it break. I also needed the capability to attach images, PDFs or other documents to arbitrary HTML. It turns out that the solution to both of these problems is the same. In a massive refactoring of BinaryResource, any BinaryResourceReference can now be attached to any other BinaryResourceReference. BinaryResource is gone, and instead the relationship is now: one BinaryResourceReference has many BinaryResourceReferenceLocales, each of which has one BinaryResourceContent. A BinaryResourceReference may also have many Attachments, each of which has a sequence number and a reference to the attached BinaryResourceReference. An HtmlPage is just a subclass of BinaryResourceReference with some bits added for the title, keywords, whether to include a message board, etc.
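To make those relationships easier to picture, here is a plain-Java sketch of the object model. Only the class names and the cardinalities come from the description above; the field names, the use of java.util.Locale, the attach method and the lack of JPA annotations are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// The actual bytes of one localized version of a resource.
class BinaryResourceContent {
    final byte[] data;
    BinaryResourceContent(byte[] data) { this.data = data; }
}

// One localized variant of a reference; each has exactly one content.
class BinaryResourceReferenceLocale {
    final Locale locale; // assumed type
    final BinaryResourceContent content;
    BinaryResourceReferenceLocale(Locale locale, BinaryResourceContent content) {
        this.locale = locale;
        this.content = content;
    }
}

// Links an owning reference to an attached one, with a sequence number.
class Attachment {
    final int sequence;
    final BinaryResourceReference attached;
    Attachment(int sequence, BinaryResourceReference attached) {
        this.sequence = sequence;
        this.attached = attached;
    }
}

// One reference has many locales and may have many attachments.
class BinaryResourceReference {
    final List<BinaryResourceReferenceLocale> locales = new ArrayList<>();
    final List<Attachment> attachments = new ArrayList<>();

    // Hypothetical convenience method: attaches another reference and
    // gives it the next sequence number (1, 2, 3, ...).
    Attachment attach(BinaryResourceReference other) {
        Attachment attachment = new Attachment(attachments.size() + 1, other);
        attachments.add(attachment);
        return attachment;
    }
}

// An HTML page is a reference with a few extra bits.
class HtmlPage extends BinaryResourceReference {
    String title;
    String keywords;
    boolean includeMessageBoard;
}
```

Because an attachment points at a BinaryResourceReference rather than at a URL or file name, renaming or moving the attached resource doesn't break the page that refers to it.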

Attachments are numbered in sequence (1, 2, 3). Inside the HtmlPageService, references to attachments are converted by an AttachmentUrlProvider (an interface) and Velocity to their URLs. So if you want to refer to the URL for attachment #1, you use ${1} in the HTML. (Roughly... this bit isn't done yet.) It is up to the AttachmentUrlProvider to decide how to make the URL, given the scope, path and extension.
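Here is a rough sketch of how that substitution could work. Note that it swaps in a plain regular expression where the real HtmlPageService uses Velocity, and that the method signature on AttachmentUrlProvider, the AttachmentTarget holder and the AttachmentLinkRewriter class are all made up for illustration; only the interface name and the scope/path/extension inputs come from the description above.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Interface name is from the post; this method signature is an assumption.
interface AttachmentUrlProvider {
    String urlFor(String scope, String path, String extension);
}

// Hypothetical holder for what the provider needs to know about one attachment.
final class AttachmentTarget {
    final String scope;
    final String path;
    final String extension;
    AttachmentTarget(String scope, String path, String extension) {
        this.scope = scope;
        this.path = path;
        this.extension = extension;
    }
}

// Hypothetical stand-in for the Velocity step: replaces ${n} with a URL.
final class AttachmentLinkRewriter {
    // Matches ${1}, ${2}, ... and captures the sequence number.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\d{1,9})\\}");

    private AttachmentLinkRewriter() {}

    static String rewrite(String html,
                          Map<Integer, AttachmentTarget> bySequence,
                          AttachmentUrlProvider provider) {
        Matcher matcher = PLACEHOLDER.matcher(html);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            AttachmentTarget target = bySequence.get(Integer.parseInt(matcher.group(1)));
            // Leave placeholders with no matching attachment untouched.
            String replacement = (target == null)
                    ? matcher.group()
                    : provider.urlFor(target.scope, target.path, target.extension);
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Keeping URL construction behind the interface means the page HTML never hard-codes a location; a different provider can produce a different URL layout without touching stored pages.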

This refactoring is probably 75% complete. I'm too burnt out on code and too tired to work on it any more this weekend.

So to summarize what's changed:

1. OpenJPA replaced Toplink Essentials
2. Everything builds and tests in Maven (including compile-time bytecode instrumentation)
3. Refactoring of binary resources
4. Initial HtmlPage work
5. Binary resource attachment support

A couple other things I learned today about JPA:

1. If you JOIN multiple things, at least with OpenJPA you need to alias each thing you join. That is, JOIN x.foo foo JOIN foo.bar bar. The parser will complain if you leave off that last "bar" in that example.
2. Your ability to lazy load ends once the EntityManager you used to load your object is closed. I knew this before and subsequently forgot, and then learned it again the hard way. Merging the entity with a new EntityManager doesn't work either. You need to keep the original one open until you're done navigating your object graph.



Sunday, November 04, 2007

Time for a Divorce

I've been using Toplink Essentials as the JPA provider for Tally Ho almost exclusively since the project began, except for a quick look at OpenJPA. This is in part because of my experience with the commercial Toplink product: I know that Oracle's Toplink is a mature product (having started out in the early 90's as a Smalltalk persistence provider), and I am comfortable working with it. Unfortunately, the open source Toplink Essentials product does not live up to the promise of Toplink. I've reached the point where I'm tired of coding around its bugs, and now that there are other, healthier projects out there, I shouldn't have to.

That got me to thinking: a lot of us developers use a lot of open source software. How do we choose which packages we want to use? Obviously whatever we choose has to be a good technical fit for our needs... what's the point if it doesn't do the job we're after? But now it occurs to me that open source software has to meet a particular social need as well. One way we can gauge a project's health is by how strong it is socially. How interested are people? Is the project active? Are people excited about the project? Excited enough to fix bugs?

It's a little hard to compare apples to apples in this case, but let's look at a couple things and try to relate them as best we can. Toplink Essentials is maintained as part of the Glassfish project.

In the last 30 days at the time of this writing, the folks working on it have fixed 10 bugs. In that same amount of time, 15 bugs were opened (or changed and left in an opened state). About 53 messages have been posted on the discussion forum. The oldest unresolved bug has been open for about a year and 10 months.

In that same amount of time, the OpenJPA folks have resolved 19 bugs while 20 have been opened. The mailing list has had about 215 posts. The oldest unresolved bug is a year and 4 months old (though it was touched 3 months ago). OpenJPA uses Jira, which makes it a bit easier to produce meaningful metrics: over the last month, the average age of an unresolved bug has been about 3 months, and that figure has been fairly consistent.

(I gave up trying to compute the average unresolved age of bugs for Toplink Essentials. It's just too annoying to figure out if the bug tracking tool doesn't do it for you.)

It is probably the case that most open source projects (and probably closed ones too) have a few ancient bugs gathering dust. I think that it's more interesting to look at what a project has been doing recently, like in the last 30-180 days. Are they keeping up with their bug backlog? Is there an active community? Are you likely to get help if you ask for it? Of the bugs that come in, what percentage get fixed and what percentage get dumped in the attic?

And perhaps the most important criterion of all: are they fixing MY bug?

While I wasn't watching, OpenJPA reached a 1.0.0 release. It's available under the Apache 2 license from a Maven 2 repository. They fixed the bug I opened earlier this year (within a day, even). It is full-featured and even has an extensive manual. Though, like Toplink's, their Ant task doesn't work very well.

I used to be concerned about the large number of dependencies that OpenJPA has, but now that the project is building with Maven 2, it's much less of a concern for me. It isn't necessary to go manually fetch anything to build the project, since Maven 2 takes care of all the direct and transitive dependencies. One thing I did have to manually tweak was to force inclusion of commons-collections 3.2 in my pom.xml, because something else in my project depends on an earlier version of commons-collections, and OpenJPA needs a later version.
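For anyone hitting the same conflict, the tweak amounts to declaring the newer version directly in the project's own pom.xml, roughly like the fragment below. This is a sketch rather than a copy of the actual Tally-Ho pom; the coordinates are the standard ones for commons-collections, and the fragment belongs inside the existing dependencies element.

```xml
<!-- Declared directly so Maven prefers 3.2 over the older version
     that another dependency pulls in transitively. -->
<dependency>
  <groupId>commons-collections</groupId>
  <artifactId>commons-collections</artifactId>
  <version>3.2</version>
</dependency>
```

This works because Maven resolves version conflicts in favor of the declaration nearest to the project, and a direct dependency is as near as it gets.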

So it's time to give Toplink one final heave-ho. My reasons for sticking with it have now been outweighed by my need for compile-time weaving that works and a project where problems are likely to be fixed within my lifetime. It's time for Toplink and me to start seeing other people.

New releases of Tally-Ho will be using OpenJPA as the persistence provider... just as soon as I get all the unit tests passing.



Saturday, April 21, 2007

Toplink Essentials: Buggier than a Roach Motel in Pensacola

Working with Toplink Essentials via JPAQL is quite a bit different than working with the commercial version of Toplink using its Expression class. With the commercial Toplink software, you generally get associated 1:1 objects fetched for you (i.e., eagerly rather than lazily) when you issue a query. In JPAQL, you get exactly what you ask for, which means if you want to get the associated objects in one query, you must use the JPAQL JOIN FETCH operator.

In my case, I needed LEFT JOIN FETCH, which works like an outer (left) join. My query ends up looking like this:

Select x from Article x LEFT JOIN FETCH x.messageBoardRoot where x.createDate > ?1 and not(x.status = ?2) order by x.createDate desc

Sometimes Articles won't have a message board associated with them, though usually they will. For example, there's no point in putting a message board on an article that is in a Pending state, since nobody can see it anyway.

Without the LEFT JOIN FETCH, Toplink issues one query to get the Articles, and then one query for every associated object. So if you're requesting 10 articles, you're going to get 11 queries. With the LEFT JOIN FETCH, it is supposed to consolidate everything into just enough queries to get what you ask for, and in fact the query it issues is reasonable:

SELECT t0.object_id, t0.thumbs_down, t0.spam_abuse, t0.MAILED, t0.change_summary, t0.VISIBLE, t0.ADJECTIVE, t0.BODY, t0.md5, t0.VIEWS, t0.fuzzy_md5_1, t0.VERSION, t0.fuzzy_md5_2, t0.thumbs_up, t0.create_date, t0.TITLE, t0.SUMMARY, t0.STATUS, t0.section, t0.changer, t0.creator, t1.object_id, t1.post_count, t1.last_post, t1.posting_permitted, t1.source_id, t1.post_count_24hr FROM ARTICLE t0 LEFT OUTER JOIN article_message_root t1 ON (t1.source_id = t0.object_id) WHERE ((t0.create_date > ?) AND NOT ((t0.STATUS = ?))) ORDER BY t0.create_date DESC
bind => [2007-04-14 14:46:15.593, P]


Unfortunately, Toplink's behaviour upon handling the results of running this query is NOT reasonable:


java.lang.NullPointerException
at oracle.toplink.essentials.mappings.ForeignReferenceMapping.buildClone(ForeignReferenceMapping.java:122)
at oracle.toplink.essentials.internal.descriptors.ObjectBuilder.populateAttributesForClone(ObjectBuilder.java:2136)
at oracle.toplink.essentials.internal.sessions.UnitOfWorkImpl.populateAndRegisterObject(UnitOfWorkImpl.java:2836)


I've filed this one as https://glassfish.dev.java.net/issues/show_bug.cgi?id=2881. If past behaviour is any indication, the Glassfish people will change the priority on the bug to a P4 and decide not to fix it until we're all very old, despite it being a significant breakage of the API. They even pull that crap when the one-liner fix is already given in the bug report, and it would take longer to reset the priority and update the bug than it would to actually fix the damn problem.


