Sunday, November 04, 2007

Time for a Divorce

I've been using Toplink Essentials as the JPA provider for Tally Ho almost exclusively since the project began, except for a quick look at OpenJPA. This is in part because of my experience with the commercial Toplink product- I know that Oracle's Toplink is a mature product (having started out in the early 90's as a Smalltalk persistence provider), and I am comfortable working with it. Unfortunately, the open source Toplink Essentials product does not live up to the promise of Toplink. I've reached the point where I'm tired of coding around its bugs, and now that there are other, healthier projects out there, I shouldn't have to.

That got me to thinking: a lot of us developers use a lot of open source software. How do we choose which packages we want to use? Obviously whatever we choose has to be a good technical fit for our needs... what's the point if it doesn't do the job we're after? But now it occurs to me that open source software has to meet a particular social need as well. One way we can gauge a project's health is by how strong it is socially. How interested are people? Is the project active? Are people excited about the project? Excited enough to fix bugs?

It's a little hard to compare apples to apples in this case, but let's look at a couple things and try to relate them as best we can. Toplink Essentials is maintained as part of the Glassfish project.

In the last 30 days at the time of this writing, the folks working on it have fixed 10 bugs. In that same amount of time, 15 bugs were opened (or changed and left in an opened state). About 53 messages have been posted on the discussion forum. The oldest unresolved bug has been open for about a year and 10 months.

In that same amount of time, the OpenJPA folks have resolved 19 bugs while 20 have been opened. The mailing list has had about 215 posts. The oldest unresolved bug is a year and 4 months old (though it was touched 3 months ago). OpenJPA is using Jira which makes it a bit easier to produce meaningful metrics such that we can find that the average unresolved age of a bug in the last month is about 3 months, which has been fairly consistent.

(I gave up trying to compute the average unresolved age of bugs for Toplink Essentials. It's just too annoying to figure out if the bug tracking tool doesn't do it for you.)

It is probably the case that most open source projects (and probably closed ones too) have a few ancient bugs gathering dust. I think that it's more interesting to look at what a project has been doing recently, like in the last 30-180 days. Are they keeping up with their bug backlog? Is there an active community? Are you likely to get help if you ask for it? Of the bugs that come in, what percentage get fixed and what percentage get dumped in the attic?

And perhaps the most important criteria of all: are they fixing MY bug?

While I wasn't watching, OpenJPA reached a 1.0.0 release. It's available under the Apache 2 license from a Maven 2 repository. They fixed the bug I opened earlier this year (within a day even). It is full-featured and even has an extensive manual. Though, like Toplink, their ant task doesn't work very well.

I used to be concerned about the large number of dependencies that OpenJPA has, but now that the project is building with Maven 2, it's much less of a concern for me. It isn't necessary to go manually fetch anything to build the project, since Maven 2 takes care of all the direct and transitive dependencies. One thing I did have to manually tweak was to force inclusion of commons-collections 3.2 in my pom.xml, because something else in my project depends on an earlier version of commons-collections, and OpenJPA needs a later version.

So it's time to give Toplink one final heave-ho. My reasons for sticking with it have now been outweighed by my need of having compile-time weaving that works and a project where problems are likely to be fixed within my lifetime. It's time for Toplink and I to start seeing other people.

New releases of Tally-Ho will be using OpenJPA as the persistence provider... just as soon as I get all the unit tests passing.

Labels: , , , , , , , ,


Saturday, November 03, 2007

How to do Static Weaving with Toplink Essentials from Maven 2

I've always wanted to make sure that the build process for Tally Ho is as seamless as possible, requiring minimal configuration by someone who downloads the monster. Currently, I do have one small step that people have to do- they must install the Toplink Essentials jar files in their local repository. This is necessary because of a licensing issue from Oracle apparently... you have to agree to a license before their jar will unpack. Fortunately, this is a fairly simple procedure, which I cover in README-MAVEN thusly:


toplink-essentials-V2_build_58: Get this from
https://glassfish.dev.java.net/downloads/persistence/JavaPersistence.html.
Download the jar, then run java -jar glassfish-persistence-installer*.jar,
accept the license agreement, and the installer will create a
glassfish-persistence directory. Change into that directory and run
mvn install:install-file -Dfile=toplink-essentials.jar \
-DgroupId=toplink-essentials -DartifactId=toplink-essentials \
-Dversion=V2_build_58 -Dpackaging=jar




So far in the project I haven't used static weaving, because it hasn't been all that vital. But an upcoming change in the way binary resources work requires it. (I need to be able to refer to a binary resource without loading all of that resource's data, since that could conceivably be a tremendous amount of data and a huge memory hog... this requires a lazy-loaded 1:1 relationship (Toplink Essentials does not support lazily loaded compositions)).

The Toplink folks do not provide a Maven 2 plugin for doing static weaving (and you must use static weaving if you're deploying to a J2SE servlet container due to classloader constraints on javaagent). They do, however, provide an ant task for static weaving, and Maven 2 can run ant tasks.

It took some time to figure out how to get the Maven dependency classpath into the Ant task so that the Ant task could find the class for static weaving, but after some digging I found the answer.

First, we define a build.xml for the ant task, to keep from severely uglifying pom.xml:
<project name="Weaver" default="weaving" basedir=".">

<description>
Run the ant task for performing static weaving on model classes. This
is meant to be run from m2 with the compile_classpath variable set.
</description>

<target name="define.task" description="New task definition for toplink static weaving">
<taskdef name="weave" classname="oracle.toplink.essentials.weaving.StaticWeaveAntTask">
<classpath>
<path path="${compile_classpath}" />
</classpath>
</taskdef>
</target>

<target name="weaving" description="perform weaving" depends="define.task">
<echo>Performing static weaving on model classes

<weave source="target/classes" target="target/classes" persistenceinfo="src/main/resources">
<classpath>
<path path="${compile_classpath}"/>
</classpath>
</weave>
</target>

</project>



The Eclipse ant task editor will of course complain that the taskdef class cannot be found, but that's okay because we don't intend to run this with ant. We're going to run it from Maven2, using the ant task runner.

One nice thing about Maven 2 is that they've included a phase just for post-processing of classes, which is the ideal place to hook into the compilation process. We add this to our pom.xml inside the <build><plugins>:
      <plugin>
<groupId>org.apache.maven.plugins
<artifactId>maven-antrun-plugin
<executions>
<execution>
<id>process-classes
<phase>process-classes
<configuration>

<echo>Beginning process-classes phase...
<property name="compile_classpath" refid="maven.compile.classpath"/>
<ant antfile="${basedir}/build.xml">
<target name="weaving"/>
</ant>
</tasks>
</configuration>
<goals>
<goal>run
</goals>
</execution>
</executions>
</plugin>



Maven creates an ant property called compile_classpath which dereferences to maven.compile.classpath property, which includes all of the compile-time dependencies declared in the pom.xml. In this case, since the pom already contains the Toplink Essentials jar file, the compile classpath will contain the class needed to run the ant taskdef.

There are, of course, still bugs with static weaving. The weaver still breaks if there is a space in the path to your classes and it still incorrectly weaves classes with lazy 1:1 relationships, failing to add some required methods for 1:1 fields that aren't lazy loaded. As these bugs haven't been touched since March of this year, I don't hold out a lot of hope for seeing them fixed any time soon.

The workaround for the first problem is to move your Eclipse workspace (or other working directory) to a path with no spaces in the name. The second can be worked around by using property access on lazy-loaded fields, although that is lame and stupid.

What kills me about that latter bug is that in the comments, the person the bug is assigned to describes exactly what needs to be done to fix the bug and where in the code to fix it. This means he had to have been looking around in the code to find it. And once he found it, rather than just fixing the damn problem, then saving and committing, he talked about how to fix it on the bug report instead. And indeed, one can still go look at the code and see how it's still broken to this day, when it could have been resolved 8 months ago for less effort than it took to write about it. It's literally a one-line fix. Maybe even half-a-line, if you want to get technical. Select some text, hit backspace, ctrl-S, run unit tests, commit.

Incidentally, it also turns out that the Toplink ant task does absolutely nothing at present. If you find that troubling, you can swap out the <weave> task for a kludge like this:

  <java classname="oracle.toplink.essentials.weaving.StaticWeave" >
<classpath>
<path path="${compile_classpath}"/>
</classpath>
<arg value="-persistenceinfo" />
<arg value="src/main/resources" />
<arg value="-loglevel" />
<arg value="finest" />
<arg value="${target_directory}" />
<arg value="${target_directory}" />
</java>



At least this way the ugliness is encapsulated in an ant build.xml file, and some day when the ant task gets fixed, you can return to using it easily.

Labels: , , , ,


Tuesday, August 14, 2007

Toplink Query Deficiencies

In any relational setting, it is wise to avoid this situation if you can:

TABLE_ONE
---------
object_id serial primary key not null
foo_id integer not null references TABLE_TWO(object_id)
bar_id integer not null references TABLE_TWO(object_id)

TABLE_TWO
---------
object_id serial primary key not null
some_field varchar(80) not null

There is no way to perform a single join from table_one to table_two to get a complete set of information, because of the references to multiple different rows in table_two. It's probably better to reorganize the relationship if you can.

The exception is if you don't necessarily need both fields. For example, if you can do without knowing anything about bar_id in most cases, you could always lazy load that field. That is, if lazy loading for 1:1 fields works in your JPA provider.

It still doesn't work with Toplink, and these bugs are STILL open:

https://glassfish.dev.java.net/issues/show_bug.cgi?id=2546
https://glassfish.dev.java.net/issues/show_bug.cgi?id=2554

The consequence is that I cannot static weave my model, so my only choice for now is to mark the fields as @Transient and ignore them for now. (I have a feature that tracks the changes made to every entry in a table with a changer that references an Account... for now, I just won't track the changer until I have time to come up with a better way of doing it or the glassfish folks fix their shit.)

Labels: , , ,


Wednesday, August 01, 2007

Another Dumbshit Toplink Error Message

When Toplink says "Trying to get value for instance variable [foo] of type [bar] from the object The specified object is not an instance of the class or interface declaring the underlying field" what it means is you specified the wrong mappedBy class in your many:many relationship, either explicitly or by inference from your generic type.

But it's much, much clearer to give their error message, isn't it.

Labels: ,


Saturday, April 21, 2007

Toplink Essentials: Buggier than a Roach Motel in Pensacola

Working with Toplink Essentials via JPAQL is quite a bit different than working with the commercial version of Toplink using its Expression class. With the commercial Toplink software, you generally get associated 1:1 objects fetched for you (ie eagerly rather than lazily) when you issue a query. In JPAQL, you get exactly what you ask for, which means if you want to get the associated objects in one query, you must use the JPAQL JOIN FETCH operator.

In my case, I needed LEFT JOIN FETCH, which works like an outer (left) join. My query ends up looking like this:

Select x from Article x LEFT JOIN FETCH x.messageBoardRoot where x.createDate > ?1 and not(x.status = ?2) order by x.createDate desc

Sometimes Articles won't have a message board associated with them, though usually they will. For example, there's no point in putting a message board on an article that is in a Pending state, since nobody can see it anyway.

Without the LEFT JOIN FETCH, Toplink issues one query to get the Articles, and then one query for every associated object. So if you're requesting 10 articles, you're going to get 11 queries. With the LEFT JOIN FETCH, it is supposed to consolidate everything into just enough queries to get what you ask for, and in fact the query it issues is reasonable:

SELECT t0.object_id, t0.thumbs_down, t0.spam_abuse, t0.MAILED, t0.change_summary, t0.VISIBLE, t0.ADJECTIVE, t0.BODY, t0.md5, t0.VIEWS, t0.fuzzy_md5_1, t0.VERSION, t0.fuzzy_md5_2, t0.thumbs_up, t0.create_date, t0.TITLE, t0.SUMMARY, t0.STATUS, t0.section, t0.changer, t0.creator, t1.object_id, t1.post_count, t1.last_post, t1.posting_permitted, t1.source_id, t1.post_count_24hr FROM ARTICLE t0 LEFT OUTER JOIN article_message_root t1 ON (t1.source_id = t0.object_id) WHERE ((t0.create_date > ?) AND NOT ((t0.STATUS = ?))) ORDER BY t0.create_date DESC
bind => [2007-04-14 14:46:15.593, P]


Unfortunately, Toplink's behaviour upon handling the results of running this query is NOT reasonable:


java.lang.NullPointerException
at oracle.toplink.essentials.mappings.ForeignReferenceMapping.buildClone(ForeignReferenceMapping.java:122)
at oracle.toplink.essentials.internal.descriptors.ObjectBuilder.populateAttributesForClone(ObjectBuilder.java:2136)
at oracle.toplink.essentials.internal.sessions.UnitOfWorkImpl.populateAndRegisterObject(UnitOfWorkImpl.java:2836)


I've filed this one as https://glassfish.dev.java.net/issues/show_bug.cgi?id=2881. If past behaviour is any indication, the Glassfish people will change the priority on the bug to a P4 and decide not to fix it until we're all very old, despite it being a significant breakage of the API. They even pull that crap when the one-liner fix is already given in the bug report, and it would take longer to reset the priority and update the bug than it would to actually fix the damn problem.

Labels: , , ,


Tuesday, March 06, 2007

Toplink Essentials: Not Ready for Prime Time

I really really like the Toplink commercial product. I think it's tremendously powerful, flexible, well-supported, and rock-solid. It does exactly what it is supposed to do, every time.

I wish I could say the same for Toplink Essentials. It feels like the open source branching was done as a rush-job, leaving large portions of the necessary code and tooling undone. My simple test case to take advantage of "weaving" (bytecode instrumentation) fails. My attempt to make the instrumentation work using static weaving (post-compile and pre-run) also fails, but in a different way. These are not elaborate test cases. My code isn't terribly complex yet, and my schema is fairly straightforward. Yet Toplink falls down and goes splat. Some of their tools can't even handle spaces in the pathname.

For now I'll keep using OpenJPA until the Glassfish people can make Toplink Essentials stop sucking so damn much. OpenJPA is a bit more pedantic anyway, and that's a good thing. My code should end up tighter as a result.

Much of this JPA stuff just isn't ready for the real world yet, though it's been a final spec since last May and a proposed final draft since December of 2005. They don't even have support for @OneToOne in Eclipse WTP 2.0 yet, and every edit to the persistence.xml file brings up some lame dialog box whining about a ConcurrentModificationException. Granted, it's just a milestone build. But this is the type of frustration one encounters when trying to use JPA.

And no, I'm not going to use sucky Hibernate and their dozens of goddamn dependencies, lack of integrated object caching, and mental-defective nerd community.

Labels: , , ,


Sunday, March 04, 2007

Toplink's Weaving is Broken

I'm reminded tonight of why I hate things that happen by "magic" in code. They are great when they're working correctly, but an impossible pain in the ass to debug when things go wrong. This is the primary reason why I dislike AOP; how do you debug something that isn't in your code? Toplink's "weaving" is very AOP-ish; it's effectively intercepting Java bytecode for setting the values for fields.

Or rather, it's not doing it very effectively, if I may overload usage of the word, because it's totally broken in a very simple and stupid way which even a primitive test case should reveal. It fails to correctly instrument put field operation for a field that has an associated Entity. In this case, that means that when I try to set my MessageRoot on my Message, I get this:

java.lang.NoSuchMethodError:
net.spatula.tally_ho.model.Message._toplink_setroot(Lnet/spatula/tally_ho/model/MessageRoot;)V
at net.spatula.tally_ho.model.Message.setRoot(Message.java:169)


As you can probably imagine, I did not write a method called _toplink_setroot. The problem is, neither did Toplink, even though it should have.

Since the Toplink instrumentation code doesn't appear to do any logging of any kind, it is utterly impossible to debug this problem. I have exactly nothing at all to go on, other than the name of a method that doesn't exist, which Toplink should have supplied if it wanted to call it.

Much as I wanted to use Toplink for this project, this problem is a complete deal-breaker. What's particularly startling to me is that such a simple test case can fail without anyone noticing.

Next I guess I'll try BEA's Open JPA (Kodo) and see if they can manage to do proper lazy loading of a 1:1 relationship AND let me create that relationship too... since that's too much to ask of Toplink.

Labels: , , , , , ,


Saturday, March 03, 2007

Optimizing the Message Tree with JPA

The trouble with recursive tree structures is that databases tend to give poor performance implementing them. Further, JPA doesn't really natively do trees (so far as I can tell). Fortunately, morons.org solved this problem a very long time ago. My solution was to associate each collection of messages with a single "root" (though it isn't a root in the true sense of the word) and with each other. If you happened to have "connect by" syntax with your database, you could take advantage of it by virtue of the parent_id column. If you didn't, you could always do a fairly quick pass through the full collection of messages you got off the single root. The latter is exactly what Tally Ho will be doing with JPA.

Key to making this work is denoting the 1:1 relationship to the parent as fetch=FetchType.LAZY.

When I first tried this with Glassfish Persistence (aka Toplink Essentials) it seemed to be ignoring my request to use FetchType.LAZY. As soon as the first object loaded, it loaded its parent. And the parent of that object. And so on, recursively. Toplink would issue one SELECT statement for every single object in the database for the "root" in question. Obviously the performance of this strategy really sucked.

It turns out that for proper behaviour you need to turn on something the Toplink folks call "weaving", which I suspect is actually just a fancy Oracle term for Java bytecode instrumentation. To do this, you add the property toplink.weaving to persistence.xml and set its value to true. You must also add the JVM argument -javaagent:{path to libs}/toplink-essentials-agent.jar to your running configuration.

Once I did this, Toplink would happily wait to fetch any parent objects until they were asked for. And by virtue of all parent/child relationships being collected inside the same "root", when you first ask for a parent, it has already been loaded in memory, so an additional trip to the database is not required.

The downside is that the "children" List on the Message won't get populated, and if you let JPA handle the population, it will again make a lot of trips to the database to figure out which children belong to which parent... it has no way of knowing my rule that all of the relationships among the children are contained within the confines of that one root.

Fortunately for the needs of Tally Ho, this problem is easily mitigated. We simply mark the children List with @Transient to prevent JPA from trying to do anything with it at all. Then we pull this trick in MessageRoot:

public List getMessages() {
if (!messagesFixed) {
fixMessages();
}
return messages;
}

private void fixMessages() {
for (Message message : messages) {
if (message.getParent() != null) {
message.getParent().getChildren().add(message);
}
}
messagesFixed = true;
}


The only reason we can get away with this is that we never look at a collection of Messages without getting it from the MessageRoot. In the cases where we do look at a single Message, we don't look at its children.

It's a tiny bit hackish, but the performance improvement is tremendous, so it's worth it. Over the Internet, through my ssh tunnel to my server, 180 message objects were loaded and linked together correctly in about 650ms. I suspect this will improve substantially when the database is on the local host.

Labels: , , ,


This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]