Locating JIT-generated machine code using gdb

This article describes how I was able to locate machine instructions generated by the OpenJDK JIT compiler on Linux using the gdb debugger. It may be useful to those who want to understand what JIT does with your code. I used this procedure to find that JIT emits efficient branchless code, as described here. I decided to share this information because I spent a considerable amount of time figuring it out. It’s possible that easier approaches exist – please feel free to suggest them.

First, I used these 2 options with the java command: -XX:+AggressiveOpts -XX:+PrintCompilation

AggressiveOpts forces the JVM to compile nearly everything it sees into native instructions. PrintCompilation makes it report what it compiles. With these 2 options together, there is a good chance that your target function will be compiled, and you get a log statement to confirm that it was.

Next, I found that it is best to place a breakpoint just before your target section of code. But how does one set a breakpoint? Java breakpoints are useless for native code debugging, because Java debuggers don’t give you an opportunity to jump from Java code into native instructions. A breakpoint in the JVM’s interpreter code would be triggered countless times while the JVM interprets your Java program. I found this to be impractical even for tiny handcrafted sample programs. The answer: you create a simple no-op native function and set a breakpoint in it!

I took inspiration from Fahd’s core dumper code. He describes how to invoke native code from Java to cause a core dump (by deliberately dereferencing a null pointer). We can take this example, but eliminate the core dump. Our function will do absolutely nothing – it will simply serve as a hook to hang the debugger breakpoint on.

The steps will be:

1. Create a Java class:


public class Hookable {

    // load the library
    static {
        System.loadLibrary("nativelib");
    }

    // native method declaration
    public native void here();

    public static void main(String[] args) {
        // any prep work
        new Hookable().here();
        // Your target code goes here
    }
}

2. Compile your class with javac


javac Hookable.java

3. Generate JNI header file


javah -jni Hookable

4. Write the C code for the function on which you will set a breakpoint in gdb:


#include "Hookable.h"

void foo() {
    int a = 2;
    return;
}

JNIEXPORT void JNICALL Java_Hookable_here
  (JNIEnv *env, jobject obj) {
    foo();
}

5. Compile the C code


gcc -fPIC -o libnativelib.so -shared -I$JAVA_HOME/include/linux/ -I$JAVA_HOME/include/ Hookable.c

6. Launch gdb to run Java under its control:


gdb --args java -XX:+AggressiveOpts -XX:+PrintCompilation Hookable

When gdb starts, add a breakpoint for foo. gdb will complain, because Java is not yet running, so foo can’t be found yet. Confirm that you really want to add this breakpoint by responding “y” to the prompt:


(gdb) break foo

Function "foo" not defined.

Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (foo) pending.

Run Java:


(gdb) run

Starting program: /usr/bin/java -XX:+AggressiveOpts -XX:+PrintCompilation Hookable

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

[New Thread 0x7ffff7fce700 (LWP 7336)]

…

gdb will stop at the breakpoint in foo:

Breakpoint 1, 0x00007fffe48c96ff in foo ()

from /home/ivan2/dev/benchmarks/libnativelib.so

(gdb)

Now would be a good time to set up your debugging session. For example, tell gdb to display the next machine instruction to be executed:


(gdb) display/i $pc

1: x/i $pc

=> 0x7fffe48c96ff <foo+4>: movl $0x2,-0x4(%rbp)

(gdb)

Exit foo and Java_Hookable_here, the 2 frames that exist solely as a marker and a place to hang the breakpoint on:


(gdb) fin

Run till exit from #0 0x00007fffe48c96ff in foo ()

(gdb) fin

Run till exit from #0 0x00007fffe48c9723 in Java_Hookable_here ()

Your debugging session is now stopped at the point marked “// Your target code goes here”. The next debugger step will take you out of native C code and back into the Java realm, exactly where you want to be. Happy hunting for JIT output!

Confirmed: the OpenJDK JIT compiler emits efficient conditional move machine code

On the popular programmers’ Q&A site stackoverflow.com, the hit question of the last few years dealt with the effect of branching on program performance. I recommend that you read the brilliant explanation there, but if you don’t have much time, here’s the summary. Modern processors execute commands in a pipeline. While one command is finishing executing, the next one has already started, the one after it is being prepared, and so on. Pipelining can only work if the processor knows which command will execute next. If the program is linear, determining the next command is trivial. But if the program branches (think “if” statement), the processor can only guess what comes next. If the processor guesses wrong, it has to back out the several steps of the wrong branch it took. Such a backout severely degrades performance.

Fortunately, modern processors have branch prediction mechanisms, which allow them to guess branching correctly when a particular branch exhibits consistent behavior. An example would be an if statement inside a loop. If the code takes the “then” branch consistently, the processor will be able to make the correct guess and performance will remain optimal.

So efficient code would avoid branching, or at least make its branching predictable. One could assume that to write efficient code, a programmer needs to be aware of branch prediction and code around it. An example in the linked Stack Overflow article shows how sorting data before processing it in a loop creates consistent branching and dramatically improves performance. Another trick would be to replace mini-branches with equivalent branchless code. For example, the function min(a,b) is normally written as

if (a<b) return a; else return b; 

It could be rewritten (for Java’s 32-bit signed ints, with the assumption that the input range does not cause overflow/underflow) as

return a-((b-a)>>31&1)*(a-b)
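A quick sketch can verify that the two forms agree (assuming, as noted, inputs whose difference does not overflow; the class and method names here are mine, not from any library):

```java
public class BranchlessMin {

    // Conventional branching min
    static int ifMin(int a, int b) {
        if (a < b) return a; else return b;
    }

    // Branchless min: (b-a)>>31 is -1 when b<a, 0 otherwise (assumes no overflow in b-a)
    static int branchlessMin(int a, int b) {
        return a - ((b - a) >> 31 & 1) * (a - b);
    }

    public static void main(String[] args) {
        int[][] pairs = { {1, 2}, {2, 1}, {-5, 3}, {7, 7}, {0, -1} };
        for (int[] p : pairs) {
            if (ifMin(p[0], p[1]) != branchlessMin(p[0], p[1]))
                throw new AssertionError("mismatch for " + p[0] + "," + p[1]);
        }
        System.out.println("all pairs match"); // prints all pairs match
    }
}
```

Walking through one case: for a=2, b=1, b-a is -1, the shift yields -1, the mask yields 1, so the result is a-(a-b) = b, which is the minimum.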

Should you write code in this obscure style to gain performance? I’m going to show that in Java this trick is unnecessary.

The key to the problem lies in the fact that in some cases an if statement doesn’t produce branching! X86 processors (Intel and AMD), which power most PCs and a great many servers, have supported conditional move instructions since the Pentium Pro, released about 20 years ago in 1995. A conditional move instruction is a way to achieve a branching effect without branching code. The meaning of this instruction is “move a value from one place to another IF the specified condition is met, do nothing otherwise”. This instruction can be used to determine the minimum of 2 values or to express a ternary operator (?:) in C or Java.

The Stack Overflow post cited benchmark results suggesting that C compilers are not consistent in emitting these efficient conditional move instructions. Java seemed to use conditional moves, based on performance results. I ran a slightly different benchmark (computing the minimum of 2 numbers using if-return and the ternary operator), comparing run times between branch-friendly ordered input data and noisy random input data. There was no difference in execution times, suggesting that branching was not a factor.

I decided to check what instructions the OpenJDK JIT compiler generates. For my tests I used Java 7 on my home Ubuntu Linux machine running on a 64-bit AMD processor.


ivan2@bigbox:~$ java -version

java version "1.7.0_79"

OpenJDK Runtime Environment (IcedTea 2.5.5) (7u79-2.5.5-0ubuntu0.14.04.2)

OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)

I wrote a primitive test case with simple min function:


public int ifmin(int a, int b) {
    if (a<b) return a; else return b;
}

My goal was to confirm that this simple code would be compiled into native instructions with a conditional move (CMOVxx), not a conditional jump (Jxx). My primitive function compiled to this Java bytecode:


0: iload_0
1: iload_1
2: if_icmpge 7
5: iload_0
6: ireturn
7: iload_1
8: ireturn

I caused the JVM to compile the bytecode to native instructions and traced execution with the gdb debugger. By the way, locating the compiled code took quite a bit of effort. I probably missed an obvious approach, but in any case, I explain how I pinpointed JIT-generated machine code in a separate post. I verified in a log file that compilation happened:


Compilation events (4 events):

Event: 0.077 Thread 0x00007f03bc0a9000 1 ivan.Code::ifmin (9 bytes)

I was then able to trace the execution of the function and see the actual machine instructions. The relevant section of gdb session looked like this:


=> 0x7fffed060190: sub $0x18,%rsp

=> 0x7fffed060197: mov %rbp,0x10(%rsp)

The instructions above set up the stack frame in preparation for the comparison.


=> 0x7fffed06019c: cmp %ecx,%edx

The above instruction compared a with b


=> 0x7fffed06019e: mov %ecx,%eax

And the above moved b to result


=> 0x7fffed0601a0: cmovl %edx,%eax

Next: a conditional move if a&lt;b. The logic of this step is as follows: prior to this step, result=b (we just moved b to result). Now, if a&lt;b, result will be overwritten with a. So if a&lt;b, then result=a. Otherwise, result remains unchanged, meaning result=b. Just what we wanted!

Next, finalize and return – nothing terribly interesting


=> 0x7fffed0601a3: add $0x10,%rsp

=> 0x7fffed0601a7: pop %rbp

=> 0x7fffed0601a8: test %eax,0xaf96e52(%rip) # 0x7ffff7ff7000

=> 0x7fffed0601ae: retq

Similarly, I confirmed that the ternary operator is compiled into a conditional move instruction. This Java code:


int a = aarr[1];

int b = barr[2];

res[3] = a<b?a:b;

produced a similar machine code sequence, including a conditional move:


=> 0x7fffed05fc9c: cmp %ebp,%r11d

The above compares a with b


=> 0x7fffed05fc9f: cmovl %r11d,%ebp

And this line overwrites one with the other – conditionally!

Conclusion: for simple conditionals, modern JIT compilers emit efficient branchless code. No additional programming tricks are required. The clever interview trick of computing the minimum of 2 numbers through obscure bitwise code should remain confined to interview rooms – it does nothing to improve performance in real applications.

Comparison of concurrency features of Java and C#

As promised in my previous post, here’s a quick comparison of concurrency management techniques for multithreaded applications in Java and C#. The first thing I’d like to get off my chest is that the concurrency facilities offered by the two languages are quite similar. This is not surprising given their history, of course. These facilities are structured exactly the same way and comprise 5 major parts.
First, there is a “default” locking/synchronization option that is safe and easy to use, but is not flexible and imposes a performance penalty, especially in truly concurrent applications with a high level of contention. It is the first concurrency option that beginners learn, and it is the best option for normal applications that do not require top performance. It is known as “synchronized” in Java and “lock” in C#.
Second, there are advanced locks that offer the ability to poll for a lock, limit wait time, etc. These are advanced facilities, and their downside is the need to be very careful to manually release each lock in a finally block. Their advantage is the opportunity to achieve several times better performance in highly concurrent applications with a high level of contention. In Java, this facility is implemented by classes in the java.util.concurrent.locks package; in C# it is System.Threading.Monitor.
Third, there is a collection of pre-built primitives utilizing those advanced locks. They simplify coding for some typical locking applications.
Fourth, there is a supplementary mechanism for signalling between threads interested in the same resource.
Fifth, there is a lock-free, wait-free facility based on the hardware-optimized (in Java, platform-dependent) compare-and-swap pattern.
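Before the detailed comparison, here is what the advanced-lock discipline from the second category looks like in Java: poll for the lock, and always release it in a finally block (a minimal sketch with a hypothetical shared counter, not code from any particular application):

```java
import java.util.concurrent.locks.ReentrantLock;

public class AdvancedLockDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static int counter = 0;

    // Polls for the lock instead of blocking indefinitely
    static boolean increment() {
        if (!lock.tryLock()) {
            return false; // lock busy: the caller can do something else instead of waiting
        }
        try {
            counter++;
            return true;
        } finally {
            lock.unlock(); // manual release is mandatory, unlike synchronized
        }
    }

    public static void main(String[] args) {
        System.out.println(increment()); // uncontended, so this prints true
    }
}
```

The tryLock/finally pattern is exactly the extra care mentioned above: forget the finally block and the lock is leaked forever, something that cannot happen with synchronized.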
Here’s comparison of specific details of these facilities.

Simple “default” locks
  Implementation: Java – synchronized keyword; C# – lock keyword
  Functionality: Identical.

Advanced locks
  Implementation: Java – java.util.concurrent.locks package, primarily ReentrantLock and ReentrantReadWriteLock; C# – System.Threading.Monitor class
  Functionality: Java disadvantages: “low observability” – not visible in a thread dump, so more difficult to troubleshoot and debug. Java advantages: a richer API, supporting fair locks (guaranteeing the lock is given to threads in the order it is requested) and multiple methods to examine the queue of threads waiting for the lock. In real life, fair locks are rarely used because of a severe performance penalty. C# advantages: as easy to debug as simple locks, because the underlying implementation is exactly the same. C# disadvantages: a more limited API.

Pre-built primitives
  Implementation: Java – various classes in the java.util.concurrent package; C# – various classes in the System.Threading namespace
  Functionality: The APIs are different, although some key concepts match. A comparison of the specific primitives offered by Java and C# is a whole topic in itself, and one table cell cannot possibly do it justice, so I will not even try to cover it here.

Signalling
  Implementation: Java – wait/notify/notifyAll methods of Object; C# – Wait/Pulse/PulseAll methods of Monitor
  Functionality: Effectively identical.

Lock-free, wait-free
  Implementation: Java – java.util.concurrent.atomic package; C# – System.Threading.Interlocked
  Functionality: In Java you must use an instance of a class from the java.util.concurrent.atomic package: you need to define an object as atomic to use this facility, which is quite limiting. In addition, the Atomic classes are not directly related to the regular non-atomic types they represent (e.g. AtomicBoolean is not a java.lang.Boolean and AtomicInteger is not a java.lang.Integer, although it is a java.lang.Number). In C#, any regular numeric type or reference can be used.
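To make the last row concrete, here is a minimal sketch of the Java side of the lock-free facility, using java.util.concurrent.atomic:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(5);
        // compareAndSet succeeds only when the current value matches the expected one
        boolean first = value.compareAndSet(5, 6);   // expected 5, actual 5: succeeds
        boolean second = value.compareAndSet(5, 7);  // expected 5, actual 6: fails
        System.out.println(first + " " + second + " " + value.get()); // prints true false 6
    }
}
```

The failed second call is the essence of compare-and-swap: instead of blocking, the caller learns that another update won the race and can retry with the fresh value.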

Can you see a java.util.concurrent.locks lock in a thread dump?

Before Java version 5, there was only one tool in the multithreading toolbox for protecting critical sections of code with a lock: synchronized methods and code sections. To date, synchronized remains the most reliable and simple method. However, it had a number of shortcomings, so Java SE version 5 introduced a few improvements, including more flexible locks, which are implemented as classes in the java.util.concurrent.locks package. Naturally, the increased flexibility came at a price. JUCL locks have a number of shortcomings, including “low observability”. Bear with me as I explain.
Regular locks and monitors that the JVM creates to implement synchronized methods or sections appear in Java thread dumps, which is great for debugging, especially when the JVM hangs. A permanent hang is frequently caused by a deadlock: a condition where thread A is waiting for a lock that is owned by thread B, while thread B is waiting for a lock that is owned by thread A. A thread dump clearly shows the threads that own locks and the threads that are waiting for them, which helps pinpoint the problem.
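The deadlock condition just described can even be reproduced and detected programmatically. Below is a sketch (the class name is mine, and it is written with Java 8 lambdas for brevity) that manufactures the classic two-lock deadlock in daemon threads and asks the JVM’s ThreadMXBean to confirm it:

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CyclicBarrier;

public class DeadlockDemo {
    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    // Creates the A-waits-for-B, B-waits-for-A condition,
    // then polls the JVM's deadlock detector until it reports it.
    static boolean createAndDetectDeadlock() {
        final CyclicBarrier bothHoldFirstLock = new CyclicBarrier(2);
        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                await(bothHoldFirstLock);
                synchronized (lockB) { } // never acquired: t2 owns lockB
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                await(bothHoldFirstLock);
                synchronized (lockA) { } // never acquired: t1 owns lockA
            }
        });
        t1.setDaemon(true); // daemon threads let the JVM exit despite the deadlock
        t2.setDaemon(true);
        t1.start();
        t2.start();
        for (int i = 0; i < 50; i++) { // poll for up to ~5 seconds
            long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
            if (ids != null && ids.length == 2) {
                return true;
            }
            try { Thread.sleep(100); } catch (InterruptedException e) { return false; }
        }
        return false;
    }

    private static void await(CyclicBarrier barrier) {
        try { barrier.await(); } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + createAndDetectDeadlock());
    }
}
```

The barrier guarantees both threads hold their first lock before attempting the second, so the deadlock is deterministic; taking a thread dump while this program waits shows exactly the owner/waiter picture discussed above.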
Here are a couple of examples of thread dump fragments showing a thread holding a regular (synchronized) lock and another thread waiting to lock the same object:
A fragment of thread dump generated by IBM JVM, with Thread-6 holding a lock and Thread-7 waiting for it. Waiting thread is clearly identified.


1LKPOOLINFO Monitor pool info:
2LKPOOLTOTAL Current total number of monitors: 4
NULL
1LKMONPOOLDUMP Monitor Pool Dump (flat & inflated object-monitors):
2LKMONINUSE sys_mon_t:0x0086A804 infl_mon_t: 0x0086A83C:
3LKMONOBJECT LockTest$LockableResource@0x10139268/0x10139274: Flat locked by "Thread-6" (0x0250F600), entry count 1
3LKWAITERQ Waiting to enter:
3LKWAITER "Thread-7" (0x02652D00)

A fragment of a thread dump generated by the Sun JVM, with Thread-0 holding a lock on 0x2406eb00 and Thread-1 waiting to lock the same object.


"Thread-1" prio=6 tid=0x02be6000 nid=0x1b60 waiting for monitor entry [0x02f7f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at LockTest$LockableResource.dosync(LockTest.java:32)
- waiting to lock <0x2406eb00> (a LockTest$LockableResource)
at LockTest$Worker.run(LockTest.java:42)

"Thread-0" prio=6 tid=0x02be5000 nid=0x1f9c waiting on condition [0x02f2f000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at LockTest$LockableResource.dosync(LockTest.java:32)
- locked <0x2406eb00> (a LockTest$LockableResource)
at LockTest$Worker.run(LockTest.java:42)

JUCL locks are different from synchronized locks. They are just regular Java objects, so the thread dump pays them no special attention, and these locks do not appear in thread dumps. This is unfortunate, but this behavior is by design. The best you can hope for is indirect evidence of a JUCL lock, and even that is not always available. You can not see the lock in the locks/monitors section of a thread dump; any help will have to come from analyzing call stacks. If you are using IBM JVM version 6, this is all you can go on. Version 6 of the Sun JVM gives you one more hint: the blocked thread will be in “WAITING (parking)” state. This is not a lot, but if you can not have a full loaf, half a loaf (actually, more like a crumb) is still better than nothing.
To illustrate the point, I rewrote the test program to lock a ReentrantLock in one thread, put the thread to sleep, and attempt to lock the same lock in another thread.
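The test program was roughly of the following shape (a reconstruction, not the original code; all names are made up):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockDumpTest {
    private static final ReentrantLock lock = new ReentrantLock();

    // Holder takes the lock and sleeps; waiter parks trying to take it.
    // Returns the waiter's state, which a Sun JVM dump shows as WAITING (parking).
    static Thread.State blockOnJuclLock() {
        CountDownLatch lockTaken = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            lockTaken.countDown();
            try {
                Thread.sleep(10000); // hold the lock while we observe the waiter
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        Thread waiter = new Thread(() -> {
            lock.lock(); // parks here until the holder releases
            lock.unlock();
        });
        holder.setDaemon(true);
        waiter.setDaemon(true);
        holder.start();
        try {
            lockTaken.await(); // make sure the holder owns the lock first
            waiter.start();
            for (int i = 0; i < 50; i++) { // poll until the waiter parks
                if (waiter.getState() == Thread.State.WAITING) break;
                Thread.sleep(100);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return waiter.getState();
    }

    public static void main(String[] args) {
        System.out.println(blockOnJuclLock()); // prints WAITING
    }
}
```

Taking a thread dump while this program runs produces output like the fragments below: the waiter is parked inside AbstractQueuedSynchronizer, with no monitor entry anywhere.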
Using java.util.concurrent.locks.ReentrantLock instead of synchronized changed the look of the thread dump. In the monitors section of the thread dump generated by the IBM JVM, there was no mention of our JUCL locks. The number of monitors went down from 4 to 2, and the remaining 2 are system objects (JVM shutdown hook and garbage collector). Here’s the call stacks section of the thread dump, which does not show much. The most you will be able to see is that Thread-7 is waiting for a JUCL lock.


3XMTHREADINFO "Thread-6" J9VMThread:0x022F7700, j9thread_t:0x0086B4B4, java/lang/Thread:0x10128248, state:CW, prio=5
3XMTHREADINFO1 (native thread ID:0x1910, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at java/lang/Thread.sleep(Native Method)
4XESTACKTRACE at java/lang/Thread.sleep(Thread.java:851)
4XESTACKTRACE at LockTest$LockableResource.trynew(LockTest.java:33)
4XESTACKTRACE at LockTest$Worker.run(LockTest.java:54)
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK KiFastSystemCallRet+0x0 (0x7C90E514 [ntdll+0xe514])
4XENATIVESTACK WaitForSingleObject+0x12 (0x7C802542 [kernel32+0x2542])
4XENATIVESTACK j9thread_sleep_interruptable+0x101 (j9thread.c:1434, 0x7FFA12F1 [J9THR24+0x12f1])
4XENATIVESTACK jclCallThreadSleepInterruptable+0xe1 (threadhelp.c:214, 0x7FC97D31 [jclscar_24+0x37d31])
4XENATIVESTACK java_lang_Thread_sleep+0x61 (jlthr.asm:107, 0x7FC814E1 [jclscar_24+0x214e1])
4XENATIVESTACK javaProtectedThreadProc+0x7d (vmthread.c:1653, 0x7FF2C5CD [j9vm24+0x3c5cd])
4XENATIVESTACK j9sig_protect+0x41 (j9signal.c:144, 0x7FECBFA1 [J9PRT24+0xbfa1])
4XENATIVESTACK javaThreadProc+0x35 (vmthread.c:260, 0x7FF2CDF5 [j9vm24+0x3cdf5])
4XENATIVESTACK thread_wrapper+0xbf (j9thread.c:947, 0x7FFA3F2F [J9THR24+0x3f2f])
4XENATIVESTACK _endthread+0xaa (0x7C34940F [msvcr71+0x940f])
4XENATIVESTACK GetModuleFileNameA+0x1ba (0x7C80B729 [kernel32+0xb729])
NULL
3XMTHREADINFO "Thread-7" J9VMThread:0x023FC400, j9thread_t:0x0086B250, java/lang/Thread:0x101283E0, state:P, prio=5
3XMTHREADINFO1 (native thread ID:0x1B04, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at sun/misc/Unsafe.park(Native Method)
4XESTACKTRACE at java/util/concurrent/locks/LockSupport.park(LockSupport.java:173)
4XESTACKTRACE at java/util/concurrent/locks/AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:759)
4XESTACKTRACE at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:790)
4XESTACKTRACE at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1126)
4XESTACKTRACE at java/util/concurrent/locks/ReentrantLock$NonfairSync.lock(ReentrantLock.java:198)
4XESTACKTRACE at java/util/concurrent/locks/ReentrantLock.lock(ReentrantLock.java:274)
4XESTACKTRACE at LockTest$LockableResource.trynew(LockTest.java:31)
4XESTACKTRACE at LockTest$Worker.run(LockTest.java:54)
3XMTHREADINFO3 Native callstack:

A thread dump generated in the same situation using the Sun JVM gives a little more information: it specifies precisely which object Thread-1 is waiting to lock (0x2406ede0). Unfortunately, you have no way of knowing which thread owns this lock object.


"Thread-1" prio=6 tid=0x02be6000 nid=0x102c waiting on condition [0x02f7f000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x2406ede0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(Unknown Source)
at java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
at LockTest$LockableResource.trynew(LockTest.java:31)
at LockTest$Worker.run(LockTest.java:54)

"Thread-0" prio=6 tid=0x02be5000 nid=0xcc8 waiting on condition [0x02f2f000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at LockTest$LockableResource.trynew(LockTest.java:33)
at LockTest$Worker.run(LockTest.java:54)

To beat a dead horse: Microsoft’s .NET synchronization facilities are largely similar to Java’s, but their advanced locks are implemented using the same System.Threading.Monitor as simple locks and do not suffer from low observability during debugging. I will follow up on this post with a comparison of locking facilities for multithreading in Java and C#.

UPDATE 5/1/2011. As promised, I posted a comparison of synchronization facilities in Java and C#.

Hibernate: Does MVC make LAZY a 4-letter word?

Let’s discuss how Hibernate, an ever-popular O/R mapping tool, works for data retrieval in enterprise Java applications. Usually, objects are represented in a relational DB as trees. When you attempt to load an object, how far down the branches do you want to go to retrieve information? If you get the answer wrong, your retrieval operation will be inefficient. Let’s consider a Customer object that has subordinate Address objects and Account objects (in a real application, it may have dozens more, but for the purpose of this discussion these two will do). Your application requests a specific Customer. Should Hibernate load the Accounts too? If it does not, and you later attempt to access an Account, you are in for a big disappointment, because the Account is unavailable. You will probably receive a NullPointerException. I’m oversimplifying the situation and considering a naïve O/R mapper here. Real O/R mappers, such as Hibernate, have developed solutions to deal with this conundrum, but it is my intention to introduce the problem first and explain how an O/R mapper approaches it next.

If Hibernate misses your intent in the opposite way and loads Accounts you do not need, that is not very good either. You simply can not load all Accounts and Addresses every single time. It might trigger several SELECT queries for different tables while you only need data from one – as clear an example of waste as it gets. Even if there is only one SQL SELECT query, the database cost of retrieving all this information will be quite high, as the RDBMS will traverse multiple tables to satisfy all the inner joins. Your application will also incur the cost of unmarshalling all the objects you never needed from wire format into Java. Heap memory will be allocated for all the subordinate objects you do not need. And you may not even be able to load the entire object tree at all, depending on the way your database is structured.

As you see, it’s pretty important to know which subordinate objects your application needs loaded. Hibernate addresses this by providing a number of fetching strategies for object associations. Default behavior provided by Hibernate for collection-valued associations is lazy initialization: a collection is fetched when the application invokes an operation upon that collection. In our scenario, if you only want to access the Customer object, lazy initialization does a nice job and protects you from “overloading”: no unwanted associations will be fetched. But what about the other case, when you do want to load the accounts? Hibernate provides several options that help you tune the access, including regular lazy initialization with select fetching, resulting in one SELECT for all accounts, and super-lazy initialization suitable for large collections.

Now, let’s consider an MVC or n-tier application. In an application of this kind, the data retrieval operation is separated from the code that uses the data, by architectural layering or even physically (in a different JVM). How does this change the way Hibernate’s lazy initialization works? Quite a bit.

The problem is, you need to access an Account for Hibernate to trigger a SELECT. If your application is multi-tiered or MVC, it usually accesses and processes data, such as accounts, in a totally different layer than where it fetches the data. A data access method returns a Customer object loaded from the database. Then a different method, maybe a JSP in the View phase, accesses the Customer object and tries to navigate to the accounts. You would need to keep the database connection open so that your code could trigger an additional DB query for the Account(s) AFTER the Customer object has been retrieved and returned from the data layer. When applied in an MVC application, this pattern is known as Open Session in View, because the Hibernate Session remains open even when control has returned to the View layer. I personally find Open Session in View unworkable on too many levels. Conceptually, it makes a joke of layering, as data access is now happening all over your code. DB connection and transaction management becomes more difficult, because you can not close the connection at the end of the data access method. Now you need a framework for closing connections reliably with some kind of “request finalizer”, such as a Servlet filter (check out an example in this article to see how ugly it is). From the often overlooked performance perspective, this dramatically extends the length of time the application holds on to connections, requiring a much larger connection pool.

If application tiers are physically separated, Open Session in View becomes impossible altogether (for obvious reasons, JDBC connections and Hibernate Sessions are not serializable/transportable to another JVM).

We see that lazy initialization by itself is not enough for an MVC/n-tier application. Because of layer separation, you can not defer association data retrieval “until it is needed”. You must know, going into the data layer, how much data you need (which associations to retrieve). So you make your data retrieval method in the data layer “need-aware”. Your data retrieval method for the Customer object in the Controller (or data layer) will accept additional parameters indicating whether you also need Accounts and Addresses. The numbers game that happens next is very important. Does your application need a lot of different sets of associations? If yes, you will quickly realize that individual methods per association (e.g. getCustomer, getCustomerWithAccounts, getCustomerWithAddresses) are not sustainable. You can not keep adding methods like getCustomerWithAccountsAndAddresses, since the number of methods grows exponentially, doubling with every new association. Six associations is a low number for a typical real-life object, and it still gives you a whopping 64 methods, which is impractical. If you only need 2-3 combinations (such as the customer alone, the customer with accounts, and the customer with all associations), you are lucky and this article does not apply to your situation. But if you do need all those permutations of associations, please read on. An obvious solution to this puzzle is a single load method with a dynamic choice of associations to retrieve (we can dub it getCustomerWithAnyAssociations). The method will take an additional argument: an association filter object comprising a collection of flags, one per association. If you want a particular association to be fetched, you set the corresponding flag.
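The association filter object can be sketched like this (all names here are hypothetical illustrations, not Hibernate API):

```java
// Hypothetical association filter: one flag per association of Customer
public class FetchOptions {
    private boolean accounts;
    private boolean addresses;

    public FetchOptions withAccounts()  { this.accounts = true;  return this; }
    public FetchOptions withAddresses() { this.addresses = true; return this; }

    public boolean accountsRequested()  { return accounts; }
    public boolean addressesRequested() { return addresses; }

    public static void main(String[] args) {
        FetchOptions opts = new FetchOptions().withAccounts();
        System.out.println(opts.accountsRequested() + " " + opts.addressesRequested()); // prints true false
    }
}
```

A call site would then read something like customerDao.getCustomer(id, new FetchOptions().withAccounts()), and adding a seventh association means adding one flag, not doubling the method count.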

An interesting corollary to this design is that it enables defensive programming of access to associations. Since you make an explicit decision on whether to retrieve each association, you can remember these decisions in the form of flags and use them to protect access to business methods that need the data in the association.
if (customer.accountsAvailable()) {
    // do something that needs accounts
}

Next, let’s drill down into the body of the data retrieval method. For lazy initialization to trigger a load of an association, you need to access the association. It is possible to do this in a simplistic way: if you need an Account, your code will “touch” the Accounts by calling, e.g., customer.getAccounts().size(), which will trigger loading of the Accounts. This approach is workable, and I’ve seen it used in the real world. However, it has an obvious flaw: the entire design is built on calling a “touch” method solely for its side effect (you don’t really need the Accounts at that point and would be happy if they just sat there in the Customer object, but you need to call this method to force Hibernate to load the association). Relying on side effects for core functionality is not a good design approach, as it may be difficult to understand and easy to break in maintenance.

How can one improve on the “touch” approach? Give up on the idea of transparent persistence: think of Hibernate as a data service that performs explicitly modeled CRUD operations, and design your data loading methods the way you would design data services that retrieve data from a remote location. Thinking of the database as a separate and distinct tier of your infrastructure will give you better insight into the performance implications of your design.

So you construct the Hibernate query dynamically, explicitly specifying which associations to fetch. It is usually fairly easy to accomplish with either an HQL or a Criteria query. But look around – with this move, lazy initialization has disappeared from your design. At most, you are using it to avoid automatically loading any associations when you only need the base object. But you do not allow Hibernate to load associations on access. MVC/n-tier architecture made it difficult to use lazy initialization in exactly the situation it was intended for: a large number of associations and a widely varying need to load them.
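To make the dynamic construction concrete, here is a sketch of assembling the HQL string from the filter flags. The left join fetch clause is standard HQL; the class, entity, and association names are illustrative:

```java
public class CustomerQueryBuilder {

    // Builds an HQL query with a "left join fetch" clause per requested association
    public static String buildHql(boolean fetchAccounts, boolean fetchAddresses) {
        StringBuilder hql = new StringBuilder("select c from Customer c");
        if (fetchAccounts) {
            hql.append(" left join fetch c.accounts");
        }
        if (fetchAddresses) {
            hql.append(" left join fetch c.addresses");
        }
        hql.append(" where c.id = :id");
        return hql.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildHql(true, false));
        // prints: select c from Customer c left join fetch c.accounts where c.id = :id
    }
}
```

The resulting string would be passed to session.createQuery along with the id parameter; the same idea works with the Criteria API by setting a fetch mode per requested association.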

Checked or unchecked?

The exception handling model in Java requires that every checked exception (deriving from java.lang.Exception) is either caught or declared in the throws clause of the method where it is thrown. The compiler checks that this rule is followed and generates a compile-time error when it is not. Exceptions deriving from java.lang.RuntimeException are called unchecked, since they do not have to be declared and the compiler does not check for them.
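The dichotomy can be illustrated with a small sketch (both exception classes are made-up examples):

```java
public class ExceptionDemo {

    // Checked: extends Exception, must be caught or declared
    static class ConfigMissingException extends Exception {
        ConfigMissingException(String msg) { super(msg); }
    }

    // Unchecked: extends RuntimeException, the compiler does not track it
    static class InvalidStateException extends RuntimeException {
        InvalidStateException(String msg) { super(msg); }
    }

    // The throws clause is mandatory here; removing it is a compile-time error
    static void loadConfig() throws ConfigMissingException {
        throw new ConfigMissingException("config file not found");
    }

    // No throws clause needed for the unchecked exception
    static void mutate() {
        throw new InvalidStateException("object already closed");
    }

    public static void main(String[] args) {
        try {
            loadConfig(); // calling without try/catch or throws would not compile
        } catch (ConfigMissingException e) {
            System.out.println("checked: " + e.getMessage());
        }
        try {
            mutate(); // compiles fine without any handling; fails only at runtime
        } catch (InvalidStateException e) {
            System.out.println("unchecked: " + e.getMessage());
        }
    }
}
```

The contrast between the two method signatures is the whole debate in miniature: the checked exception is part of the method’s contract, the unchecked one is invisible to callers.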

The checked/unchecked dichotomy has been around since the invention of Java. Then, in the late 1990s, Microsoft decided it liked Java, but wanted to own the language. So they created C#, which was very much like Java, but with some improvements. Not surprisingly, and in typical Microsoft fashion, they did away with checked exceptions altogether. In C#, all exceptions are unchecked. A method signature says nothing about the exceptions that can be thrown by the method. There is no throws clause, so there is no way to describe which exceptions a method may throw. At first glance, this makes things easier: developers do not need to write code to deal with exceptions. This is also the problem: developers have no way of knowing which exceptions may be thrown by a method. Even if they have access to the source code of the method, it is still not enough: one needs access to the source code of all nested calls.

Since the release of C#, a vocal minority in the Java world has been advocating for adopting the same approach: doing away with checked Exceptions. While this position is rightly considered extreme, the choice between using checked and unchecked exceptions needs to be made every single time you create your own Exception class or write code throwing an Exception.

The debate about the proper place to draw the line between checked and unchecked has been ongoing in the Java community. I cannot presume to cover the entire Java universe, but I’d offer my perspective as a Web and enterprise middleware architect and developer. While wholesale abolition of checked Exceptions is out of the question, the boundary between the two kinds and their appropriate usage is decided anew for each project. Let’s consider some approaches to drawing this boundary.

In a typical Web application, every Web request requires access to some kind of backend (database, mainframe, etc.). External resource access is by its nature “exposed to the elements” over which our application has no control (a network outage, data center flooding, a partner application down, a database server overheating, etc.). That’s why Java methods that implement external access always throw a checked exception, such as IOException or SQLException. In a large and stable application, exception processing is generalized in an exception handling framework, which may be responsible for the following:

  1. insulate the end user from technical details of unexpected errors that are of no benefit to them, and facilitate presentation of a clean notification of an internal problem
  2. enable quick detection and remediation of infrastructure failures (out of disk space, lost database connection)
  3. collect and preserve exception information to aid troubleshooting and debugging
  4. record failed transaction outcome for traceability/auditing

With this in mind, I will explore two very different approaches to deciding on checked vs unchecked exceptions. First, let’s turn our attention to this article by Elliotte Rusty Harold. He suggests that we should start calling checked exceptions “external exceptions” and runtime exceptions “internal exceptions”.

If the exception is caused by problems internal to the program, it’s a runtime exception. If it’s caused by problems external to the program, it’s a checked exception.
Short, simple, to the point. Are there any exceptions to this rule? Are there any cases where an internal program bug legitimately should be signaled by a checked exception or a problem in the external environment should be signaled by a runtime exception? I can’t think of any.

I think this is a wrong way to draw the line, at least in a typical Web or middleware application.

It is true that in most situations where external system access occurs, a checked Exception is usually thrown – most frequently an IOException or a SQLException. Let’s explore a bit more. What would cause a SQLException? Usually a program bug (an unsatisfied constraint, a failure to check for maximum field length, a wrong table name) or an infrastructure problem (connection lost, out of disk space). Theoretically you could also use SQLException to signal a business condition (a record with this ID already exists), but this is a widely recognized anti-pattern and I hope you are not following it. So we are left with situations that do not allow successful completion of processing (a program bug or an infrastructure failure).

What do you do in response to a checked Exception in this situation? In my experience, individualized exception handling logic specific to a method is highly unusual. You cannot fix a program bug programmatically, and you normally cannot work around an infrastructure failure or misconfiguration. To wit: if a checking account balance request returns a RemoteException, there is nothing you can do in your Web application to recover from this situation and still show the account balance to the end user. You can only say “Sorry” in a nice way, as prescribed by your application coding standards. The same goes if you try to retrieve order shipment details and a SQLException comes back. And so on.

As a sidebar, once we’ve established that exception processing is generalized and boilerplate, we could call it a cross-cutting concern, and engage in some Aspect-Oriented Programming.

A typical generic exception handling implementation that I’ve seen wraps the original Exception in MyCompanyException and propagates the latter all the way up the call stack without examination or processing in intermediate layers. This pattern is widely accepted and generally makes sense, but it is not perfect. One thing that immediately jumps out at you is that almost every method in your application now throws MyCompanyException. This is so because Web and middleware applications exist to process data stored in different kinds of backend systems, and methods for accessing those systems invariably have a chance to fail, so they throw a checked Exception (e.g. SQLException for databases). The typical approach described above propagates these Exceptions back in the form of a checked MyCompanyException. This presents a glaring problem: signal overload. When everyone is a suspect, you do not really have one. If every method throws the same checked Exception, the “throws” clause adds no value, only overhead. You might as well close the book on checked Exceptions and go home – make MyCompanyException unchecked.
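The wrap-and-propagate pattern might be sketched like this, with MyCompanyException as the hypothetical application-wide wrapper and OrderDao standing in for a real data-access class:

```java
// Hypothetical application-wide wrapper exception; checked, so every
// method on the propagation path must declare it.
class MyCompanyException extends Exception {
    MyCompanyException(String message, Throwable cause) {
        super(message, cause);
    }
}

class OrderDao {
    // Data-access layer: the original SQLException is wrapped once,
    // preserved as the cause for later troubleshooting, and rethrown.
    String loadOrder(long id) throws MyCompanyException {
        try {
            return queryDatabase(id); // stand-in for real JDBC access
        } catch (java.sql.SQLException e) {
            throw new MyCompanyException("failed to load order " + id, e);
        }
    }

    private String queryDatabase(long id) throws java.sql.SQLException {
        // Simulated infrastructure failure for the sketch
        throw new java.sql.SQLException("connection lost");
    }
}
```

Every service and controller method calling loadOrder now needs throws MyCompanyException in its signature, which is exactly the signal overload described above.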

What can a developer do in this situation? Why would I want every programmer to worry about infrastructure problems, even at a trivial level (inserting standard calls for exception processing)? Most programmers should focus on implementing business logic. The Java community has built numerous frameworks that externalize “technical” concerns, remove the need for repetitive code, and free programmers to focus on the business problem at hand. This is one of the reasons Java development is productive enough to be a market success.

I should mention that Elliotte Harold, the author of the blog post I’ve discussed above, has become something of an “unchecked exceptions skeptic”. He recently published an article on IBM developerWorks devoted specifically to avoiding throwing a RuntimeException. He goes to great lengths and writes quite a bit of code to achieve that. He even implements his own bubble sort! Here’s what I don’t like about that. First, the more code one writes, the higher the chance of a bug. Second, when someone reimplements system functionality, there is an added risk of creating an inferior solution. This adds up to a steep price, so the author must believe unchecked Exceptions to be a very dangerous thing to justify it. I am among those who disagree and find unchecked Exceptions a useful tool. In fact, I advocate a wider use of unchecked Exceptions in the domain of Java I’m familiar with.

For comparison, let’s look at what Tata’s ShriKant Vashishtha advocated a few years back in this article:

Exceptions for which the end user cannot take any useful action should be made unchecked. For example, exceptions that are fatal and unrecoverable should be made unchecked. There is no point in making XMLParseException (thrown while parsing an XML file) checked, as the only action to be taken may be to fix the root cause based on the exception trace.

This is the crux of his article. He devotes the rest of it to elaborating how this pattern would work with Struts.
I find this approach more reasonable. The difference between checked and unchecked is that the developer is forced to “do something” about checked Exceptions. If a developer cannot do something meaningful, this “doing something” becomes nothing but waste.

Combining ShriKant’s idea with the observation that most external access Exceptions switch code execution over to the “sorry” path, I suggest an expanded usage of unchecked exceptions. Unless there are individualized exception processing or recovery possibilities, take the checked Exception thrown by the code accessing an external system and turn it into a RuntimeException right at the point where the original Exception is raised. This eliminates useless and distracting “throws MyCompanyException” clauses all over your code. You will then need to remember to catch RuntimeException in the outer layer of your code (in Struts, a Front Controller) and decide what to do with it. In an extreme case, you can even ignore it and let the container handle it. Web containers catch all Runtime Exceptions and give developers some limited configuration options for dealing with them (an error page). Of course, true business exceptions (such as failed input validation) would be exempted from this policy and would remain checked Exceptions, forcing developers to handle them explicitly.
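A minimal sketch of this approach, with hypothetical AccountService and FrontController classes: the checked IOException is converted to a RuntimeException where it originates, the intermediate layers carry no throws clauses, and only the outermost layer catches it to produce the “sorry” response:

```java
class AccountService {
    String getBalance(String accountId) {
        try {
            return callBackend(accountId);
        } catch (java.io.IOException e) {
            // No local recovery is possible, so rethrow unchecked;
            // callers up the stack need no throws clauses.
            throw new RuntimeException("backend unavailable", e);
        }
    }

    private String callBackend(String accountId) throws java.io.IOException {
        // Simulated external system failure for the sketch
        throw new java.io.IOException("remote system down");
    }
}

class FrontController {
    // The single place where unexpected failures become a user-facing
    // "sorry" response.
    String handleRequest(AccountService service, String accountId) {
        try {
            return "Balance: " + service.getBalance(accountId);
        } catch (RuntimeException e) {
            return "Sorry, we are unable to process your request right now.";
        }
    }
}
```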

You may ask: what about the monitoring/infrastructure-related goals of an Exception handling framework (the “quick detection and remediation of infrastructure failures” I mentioned above)? Since this is a cross-cutting concern that applies universally to all applications running in a particular server environment, it would be better to handle error detection and notification in the server infrastructure, independently from application code. This would free programmers from having to worry about it and increase the reliability of the code (detection/notification happens automatically and does not depend on the programmer remembering to insert the right code). It is certainly possible. For example, when using IBM WebSphere, one would leverage FFDC for error detection and build a solution on top of that.

UPDATE 11/8 in response to Elliotte’s comment below.
The key point Elliotte makes is that a local error handler is preferable to a global error handler because it has context: information about the specifics of the situation that resulted in an error. He also addresses the point that a global error handler would often find itself in a “nothing I can do here” position when trying to handle an error. But I was making a completely different point: that the local error handler is often in no position to take a meaningful action, despite having the local context. Thus responsibility for handling the error may be safely transferred to a global handler. Even if you know precisely what you’re trying to do, most errors will break the “happy path” irreparably. In most cases, after an error, the difference may be in the tone of the “Sorry” you say, but you will never be able to answer the original question (the client request). Situations where a retry is reasonable are excluded, obviously.

This point makes sense if you assume a separation of responsibilities that makes application code focus on the task at hand and leave infrastructure concerns to “others”. Please stay with me for an explanation.

It doesn’t know … whether the sys admin needs to be woken up at 3:00 AM in the morning or not

No, no. I don’t want the local error handler to figure out whether the error is so truly fatal to the functioning of the whole application that it requires immediate attention. And don’t get me wrong – I don’t want the global error handler to be responsible for that, either. I alluded to my preferred answer to this problem in the last paragraph of the original post – externalize this logic. To ensure reliable operation of your application, you should monitor it. Monitoring would include intercepting and collecting exception information. Your monitoring software would then be configured to react appropriately to situations you are interested in. You achieve a separation of concerns: Java code in your application is concerned with solving the business problem at hand, while monitoring configuration defines reactions to system errors (including paging the sys admin staff).

Getters and setters in Java

A simple Java bean with public getters and setters is arguably the simplest, most familiar and most ingrained pattern in Java. It is one of the first things many Java students write, and they remember it forever:


    private int foo;
    public int getFoo() {
        return foo;
    }
    public void setFoo(int foo) {
        this.foo = foo;
    }

When C# came along, there was a brief discussion in the Java community about the virtues of “properties” as opposed to plain private variables with getters/setters (known scientifically as accessors and mutators). But it did not lead anywhere, and by now the use of the getter/setter pattern has become automatic. Some GUI IDEs offer support for generating getter/setter methods. But is it a good thing?

Originally, getter/setter pattern was created to promote good object-oriented design by encapsulating the internals of a class from its external interface. This permits changing or refactoring internal details of the class implementation without impacting class users.

But is there any reason we generate getters/setters for value objects? The pattern of a private field with a public getter/setter is so ingrained in the minds of Java programmers that we never allow ourselves to question it. So we keep happily producing more classes with the usual trivial getter/setter pairs, which add no value, bloat the code and make assignments harder to understand. In many cases there is no encapsulation at all, as all consuming classes are meant to change, and MUST change, whenever the value object definition changes. Instead of encapsulation, we get two useless generated methods per simple field and replace a simple assignment with convoluted nested set/get calls. Ask yourself, which is easier to read:
customerBean.ssn = customerVO.ssn;
or
customerBean.setSsn(customerVO.getSsn());

I think Java community would be well served by cutting down on the getters and setters.

JAXB
Still, in some cases creating a getter/setter pair is necessary. One case where I grudgingly agree with using getters/setters is Java classes generated by the JAXB compiler, xjc. Most JAXB-generated getters and setters are of the trivial variety (they simply assign or return the corresponding field value), but they are needed for API consistency, because some of the getters/setters in JAXB are non-trivial. I can point to two such cases: an optional attribute with a default value, and lists of primitives.
Optional Attributes
This XSD


    <xsd:attribute name="Pos" type="xsd:int" use="optional" default="1"/>

will result in this non-trivial getter generated by xjc:


    @XmlAttribute(name = "Pos")
    protected Integer pos;

    /**
     * Gets the value of the pos property.
     * 
     * @return
     *     possible object is
     *     {@link Integer }
     *     
     */
    public int getPos() {
        if (pos == null) {
            return  1;
        } else {
            return pos;
        }
    }

Here, the default value (1) is returned by the getter method when the underlying Integer value is null (not assigned). You may suggest circumventing the need for the getter by initializing the field to the default value at the moment of declaration (protected int pos = 1; – this also has the side benefit of allowing the wrapper class to be replaced with a primitive, because we no longer use null). Then, if the value is never assigned, the default will be returned. But this only works if the object is used once. Any reuse of the object breaks the pattern, since there is no means to unset the value of the “pos” property in this code. The only way to enable the object to be used more than once without touching the getter is to put equivalent code inside the setter. Conclusion: an optional attribute with a default value requires non-trivial code in the getter or the setter.
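The setter-based alternative can be sketched as follows; the class name is hypothetical, the property mirrors the Pos example above, and treating null as “unset” restores the schema default while keeping the getter trivial:

```java
// Sketch: default handling moved from the getter into the setter.
class PositionedElement {
    private int pos = 1; // schema default, eagerly initialized

    public int getPos() {
        return pos; // trivial getter
    }

    // null means "unset": restore the schema default, mirroring the
    // logic xjc would otherwise place in the getter
    public void setPos(Integer value) {
        this.pos = (value == null) ? 1 : value;
    }
}
```

The non-trivial logic has not disappeared, it has only moved; either the getter or the setter must know about the default.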
Lists of integers
This case is a bit more curious. Suppose you have this schema:


    <xsd:element name="RoutedGreetings">
	<xsd:complexType>
	<xsd:sequence>
		<xsd:element name="GreetingsNumbers" type="NumberListType"/>
	</xsd:sequence>
    </xsd:complexType>
    </xsd:element>

    <xsd:simpleType name="NumberListType">
      <xsd:list itemType="xsd:int"/>
    </xsd:simpleType> 

It shows a list of integers used in another element. Xjc will lazily initialize the list on first access in the getter:


    @XmlList
    @XmlElement(name = "GreetingsNumbers", type = Integer.class)
    protected List<Integer> greetingsNumbers;

    public List<Integer> getGreetingsNumbers() {
        if (greetingsNumbers == null) {
            greetingsNumbers = new ArrayList<Integer>();
        }
        return this.greetingsNumbers;
    }

This is interesting, because I struggle to see why the list should be initialized lazily. If it were initialized eagerly, with the variable declaration (protected List<Integer> greetingsNumbers = new ArrayList<Integer>();), one would be able to do away with the getter. I’d be glad to hear thoughts from readers.
Note also that there is no setter: the getter returns a reference to the live list, so to set the value you first perform a get and then manipulate the list you obtained.
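To make the live-list pattern concrete, here is a usage sketch; the RoutedGreetings class below simply mirrors the lazily initializing getter shown above:

```java
import java.util.ArrayList;
import java.util.List;

// Mirrors the xjc-generated class: lazy getter, no setter.
class RoutedGreetings {
    protected List<Integer> greetingsNumbers;

    public List<Integer> getGreetingsNumbers() {
        if (greetingsNumbers == null) {
            greetingsNumbers = new ArrayList<Integer>();
        }
        return this.greetingsNumbers;
    }
}

class LiveListDemo {
    static int firstGreeting() {
        RoutedGreetings rg = new RoutedGreetings();
        // "Setting" the value means getting the live list and mutating it:
        rg.getGreetingsNumbers().add(42);
        return rg.getGreetingsNumbers().get(0);
    }
}
```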