Saturday, July 22, 2006

Hard Core Java: ThreadLocal

Update: As I've recently discovered, the array-based pattern at the end of this entry not only improves performance, but it can also help you work around a ThreadLocal bug present in JDK 1.4 and 1.5.

This is the first in what I hope will turn into a series of entries on the core Java libraries and language.

Use a thread local variable (where each thread has a separate value) when you want to carry a value along without explicitly passing it from method to method down the call stack. When appropriate, remember to explicitly clear the value in a finally block to prevent memory leaks:

  ThreadLocal<Context> threadLocal =
    new ThreadLocal<Context>();

  void doSomethingInContext(Context c) {
    threadLocal.set(c);
    try {
      doSomething();
    }
    finally {
      threadLocal.remove();
    }
  }
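
As a rough illustration of both halves of that advice, here is a self-contained sketch. The names CarryDemo, Context.name, currentName(), and nameSeenByOtherThread() are hypothetical and exist only for this example; the lambda syntax assumes Java 8 or later. It shows a method deep in the call stack reading the context without a parameter, another thread seeing nothing, and the value being gone after the finally block:

```java
// Illustrative sketch; CarryDemo and its members are hypothetical names.
final class CarryDemo {
  static final class Context {
    final String name;
    Context(String name) { this.name = name; }
  }

  private static final ThreadLocal<Context> threadLocal =
      new ThreadLocal<Context>();

  // Reads the calling thread's context; no Context parameter is needed.
  static String currentName() {
    Context c = threadLocal.get();
    return c == null ? null : c.name;
  }

  // Asks a brand new thread what it sees; that thread has its own, unset value.
  static String nameSeenByOtherThread() {
    final String[] seen = new String[1];
    Thread t = new Thread(() -> seen[0] = currentName());
    t.start();
    try {
      t.join();  // join() also makes the write to seen[0] visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return seen[0];
  }

  // Returns { name seen by this thread, name seen by the other thread }.
  static String[] doSomethingInContext(Context c) {
    threadLocal.set(c);
    try {
      return new String[] { currentName(), nameSeenByOtherThread() };
    } finally {
      threadLocal.remove();  // clear the value even if the body throws
    }
  }
}
```

Calling CarryDemo.doSomethingInContext(new CarryDemo.Context("request-1")) should yield "request-1" for the calling thread and null for the other thread, and currentName() should return null again once the call has finished.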

Consider whether or not the code which sets the thread local value can be reentered. I often use thread locals purely to test for reentrance. If you prohibit reentrance, fail early:

  ThreadLocal<Context> threadLocal =
    new ThreadLocal<Context>();

  void doSomethingInContext(Context c) {
    if (threadLocal.get() != null)
      throw new IllegalStateException();
    threadLocal.set(c);
    try {
      doSomething();
    }
    finally {
      threadLocal.remove();
    }
  }
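
To see the guard trip, here is a small sketch under a few assumptions: ReentranceGuardDemo and nestedCallFails() are hypothetical names, and the Runnable parameter stands in for doSomething() so the example can reenter itself. It is one way to exercise the pattern, not the only one:

```java
// Illustrative sketch; ReentranceGuardDemo and its members are hypothetical.
final class ReentranceGuardDemo {
  static final class Context {}

  private static final ThreadLocal<Context> threadLocal =
      new ThreadLocal<Context>();

  // Runs body with c installed; rejects nested calls on the same thread.
  static void doSomethingInContext(Context c, Runnable body) {
    if (threadLocal.get() != null)
      throw new IllegalStateException("already in a context");
    threadLocal.set(c);
    try {
      body.run();
    } finally {
      threadLocal.remove();
    }
  }

  // Returns true if a nested call on the same thread is rejected.
  static boolean nestedCallFails() {
    final boolean[] failed = new boolean[1];
    doSomethingInContext(new Context(), () -> {
      try {
        doSomethingInContext(new Context(), () -> {});
      } catch (IllegalStateException expected) {
        failed[0] = true;
      }
    });
    return failed[0];
  }
}
```

Because the check happens before set(), the rejected inner call never disturbs the outer call's value, and a later top-level call on the same thread succeeds once the outer finally block has run.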

If you do allow reentrance, you may want to save and restore the existing value. To support multiple reentrance, you need to keep a stack of previous values. Rather than use an explicit stack data structure, we can utilize the thread stack and save some code and overhead:

  ThreadLocal<Context> threadLocal =
    new ThreadLocal<Context>();

  void doSomethingInContext(Context c) {
    Context previous = threadLocal.get();
    threadLocal.set(c);
    try {
      doSomething();
    }
    finally {
      threadLocal.set(previous);
    }
  }
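
The save-and-restore behavior is easier to trust once you watch it nest. In this sketch NestedContextDemo, currentName(), and trace() are hypothetical names, and the Runnable parameter again stands in for doSomething():

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; NestedContextDemo and its members are hypothetical.
final class NestedContextDemo {
  static final class Context {
    final String name;
    Context(String name) { this.name = name; }
  }

  private static final ThreadLocal<Context> threadLocal =
      new ThreadLocal<Context>();

  static String currentName() {
    Context c = threadLocal.get();
    return c == null ? "none" : c.name;
  }

  // Each stack frame remembers the value it replaced and restores it.
  static void doSomethingInContext(Context c, Runnable body) {
    Context previous = threadLocal.get();
    threadLocal.set(c);
    try {
      body.run();
    } finally {
      threadLocal.set(previous);
    }
  }

  // Records the visible context before, during, and after a nested call.
  static List<String> trace() {
    final List<String> seen = new ArrayList<String>();
    doSomethingInContext(new Context("outer"), () -> {
      seen.add(currentName());
      doSomethingInContext(new Context("inner"), () -> seen.add(currentName()));
      seen.add(currentName());
    });
    seen.add(currentName());
    return seen;
  }
}
```

trace() should return outer, inner, outer, none. One thing to keep in mind: when previous is null, set(previous) leaves an entry mapped to null in the thread's map rather than removing it, so call remove() in that case if you want the entry gone.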

Thread local access isn't expensive per se, but it's also not so cheap that we want to perform unnecessary lookups in frequently executed code. The following code performs three thread local lookups for the initial call and one lookup for reentrant calls:

  ThreadLocal<Context> threadLocal =
    new ThreadLocal<Context>();

  void doSomethingInContext() {
    Context c = threadLocal.get();      // lookup 1
    if (c == null) {
      c = new Context();
      threadLocal.set(c);               // lookup 2
      try {
        doSomething(c);
      }
      finally {
        threadLocal.remove();           // lookup 3
      }
    } else {
      doSomething(c);
    }
  }
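
If you'd like to check those counts rather than take my word for it, one option is to count calls to the public ThreadLocal methods as a stand-in for lookups. This is only a sketch: CountingThreadLocal, LookupCountDemo, and the depth parameter are hypothetical, and the counter is not thread safe, which is fine for a single-threaded experiment:

```java
// Illustrative sketch; all names here are hypothetical.
final class LookupCountDemo {
  static final class Context {}

  // Counts calls to get(), set(), and remove() as a proxy for lookups.
  static final class CountingThreadLocal<T> extends ThreadLocal<T> {
    int lookups;  // only touched from one thread in this demo
    @Override public T get() { lookups++; return super.get(); }
    @Override public void set(T value) { lookups++; super.set(value); }
    @Override public void remove() { lookups++; super.remove(); }
  }

  static final CountingThreadLocal<Context> threadLocal =
      new CountingThreadLocal<Context>();

  static void doSomethingInContext(int depth) {
    Context c = threadLocal.get();
    if (c == null) {
      c = new Context();
      threadLocal.set(c);
      try {
        doSomething(depth);
      } finally {
        threadLocal.remove();
      }
    } else {
      doSomething(depth);
    }
  }

  // Reenters doSomethingInContext until depth reaches zero.
  static void doSomething(int depth) {
    if (depth > 0) doSomethingInContext(depth - 1);
  }

  // Counts lookups for one top-level call that reenters depth times.
  static int countLookups(int depth) {
    threadLocal.lookups = 0;
    doSomethingInContext(depth);
    return threadLocal.lookups;
  }
}
```

countLookups(0) should report three lookups for a lone call, and each level of reentrance should add exactly one more.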

Notice that we clean up the Context instance only when we created it, not when it already exists? The outermost invocation owns the cleanup.

By adding a flag to Context and overriding ThreadLocal.initialValue(), we can save one lookup and still ensure proper cleanup. The flag tells us whether or not the current invocation is responsible for the cleanup:

  ThreadLocal<Context> threadLocal =
    new ThreadLocal<Context>() {
      protected Context initialValue() {
        return new Context();
      }
    };

  void doSomethingInContext() {
    Context c = threadLocal.get();
    if (c.isVirgin()) {
      c.loseVirginity();
      try {
        doSomething(c);
      }
      finally {
        threadLocal.remove();
      }
    } else {
      doSomething(c);
    }
  }

  class Context {
    boolean virgin = true; // "new" is taken.
    public boolean isVirgin() {
      return virgin;
    }
    public void loseVirginity() {
      this.virgin = false;
    }
  }

We can still do better. If we store a wrapper object instead of a direct reference to the Context, we can reduce the number of thread local lookups in the initial invocation to one, one third of those in our original example. We'll use a single element array as our wrapper object to save us from having to write another class:

  ThreadLocal<Context[]> threadLocal =
    new ThreadLocal<Context[]>() {
      protected Context[] initialValue() {
        return new Context[1];
      }
    };

  void doSomethingInContext() {
    Context[] c = threadLocal.get();
    if (c[0] == null) {
      try {
        c[0] = new Context();
        doSomething(c[0]);
      }
      finally {
        c[0] = null;
      }
    } else {
      doSomething(c[0]);
    }
  }
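
Here is a sketch that exercises the array-based version and counts its lookups. ArrayHolderDemo, the lookups counter, the depth parameter, and slotIsEmpty() are hypothetical additions for the experiment; only get() is counted because it is the only ThreadLocal method this pattern calls:

```java
// Illustrative sketch; ArrayHolderDemo and its members are hypothetical.
final class ArrayHolderDemo {
  static final class Context {}

  static int lookups;  // demo counter; single-threaded use only

  static final ThreadLocal<Context[]> threadLocal =
      new ThreadLocal<Context[]>() {
        @Override protected Context[] initialValue() {
          return new Context[1];
        }
        @Override public Context[] get() {
          lookups++;
          return super.get();
        }
      };

  static void doSomethingInContext(int depth) {
    Context[] c = threadLocal.get();  // the only lookup in this call
    if (c[0] == null) {
      try {
        c[0] = new Context();
        doSomething(depth);
      } finally {
        c[0] = null;  // clears the slot without touching the ThreadLocal
      }
    } else {
      doSomething(depth);
    }
  }

  // Reenters doSomethingInContext until depth reaches zero.
  static void doSomething(int depth) {
    if (depth > 0) doSomethingInContext(depth - 1);
  }

  // Counts lookups for one top-level call that reenters depth times.
  static int countLookups(int depth) {
    lookups = 0;
    doSomethingInContext(depth);
    return lookups;
  }

  // True when no Context is left behind in the calling thread's slot.
  static boolean slotIsEmpty() {
    return threadLocal.get()[0] == null;
  }
}
```

Every call, initial or reentrant, should cost one lookup. The trade-off is that the one-element array itself stays in the thread's map for as long as the thread and the ThreadLocal live; only the Context it pointed to is released.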

If we don't want our ThreadLocal instance to prevent the garbage collection of the Context class, we should use an Object[] instead of a Context[]. We might want to do this if library code (in the system classpath perhaps) references our thread local variable and a child class loader loads Context.

Code which depends directly on a ThreadLocal can be difficult to test (on par with code which depends directly on a static variable). As an alternative, follow dependency injection patterns and inject thread local values into your code.
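
One way this could look, as a sketch rather than a prescription: the class that needs the value depends on a small provider abstraction, and only the wiring code knows a ThreadLocal is behind it. I'm using java.util.function.Supplier from Java 8 here as an assumption; a provider interface from a dependency injection framework would play the same role. InjectionDemo, Greeter, and Context.user are hypothetical names:

```java
import java.util.function.Supplier;

// Illustrative sketch; InjectionDemo and its members are hypothetical.
final class InjectionDemo {
  static final class Context {
    final String user;
    Context(String user) { this.user = user; }
  }

  // Depends on a Supplier<Context>, not on a ThreadLocal or a static field.
  static final class Greeter {
    private final Supplier<Context> context;
    Greeter(Supplier<Context> context) { this.context = context; }
    String greet() { return "Hello, " + context.get().user; }
  }

  static final ThreadLocal<Context> threadLocal = new ThreadLocal<Context>();

  // Production-style wiring: the supplier reads the calling thread's value.
  static String greetFromThreadLocal(String user) {
    Greeter greeter = new Greeter(threadLocal::get);
    threadLocal.set(new Context(user));
    try {
      return greeter.greet();
    } finally {
      threadLocal.remove();
    }
  }

  // Test-style wiring: a fixed Context, no thread local involved.
  static String greetFromFixedContext(String user) {
    final Context fixed = new Context(user);
    return new Greeter(() -> fixed).greet();
  }
}
```

Greeter can now be tested with a plain object, and the set-and-clear discipline stays in one place instead of leaking into every class that needs the context.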

4 Comments:

Blogger pulihora said...

Further:
o. In case doSomething() starts new threads, and these threads also need to access the context,
we can make use of InheritableThreadLocal (http://java.sun.com/j2se/1.5.0/docs/api/java/lang/InheritableThreadLocal.html).

o. Wrapping ThreadLocals in a singleton class helps make them accessible from other classes as well.

...
try {
   x.doSomething();
}
...
}
In the code above, if x needs to access the Context, we have a problem.

We can use singleton like below; singletons are not really evil.

public class ContextHolder {
  private static final ContextHolder holder = new ContextHolder();

  private final ThreadLocal<Context> threadLocal = new ThreadLocal<Context>();

  public Context getCurrentContext() {
    return threadLocal.get();
  }

  // ThreadLocal.set() returns void, so this setter does too.
  public void setCurrentContext(Context c) {
    threadLocal.set(c);
  }

  public static ContextHolder getHolder() {
    return holder;
  }
}

o. Method and protocol (RMI, CORBA) interceptors and command executors tend to be good places to set ThreadLocal context info.

10:25 AM  
Blogger Bob said...

This is only true if your thread local value strongly references your ThreadLocal instance, and it applies whether your thread local variable is a static or instance variable. It's analogous to a WeakHashMap value strongly referencing its key. The ThreadLocal instance is essentially a key in a WeakHashMap (though not literally).

5:58 PM  
Blogger rpbarbati said...

I would rather each thread have instances of the classes it uses, and use private data members in those classes rather than use ThreadLocal, avoiding pretty much all the extra code and the GC issues, as well as making the whole much easier to debug.

But that's just me...

12:22 PM  
Anonymous Anonymous said...

@rocketrpb
Start by understanding the need to share data between multiple threads, and you will have an idea of what's being talked about here.

11:40 PM  
