Sunday, December 10, 2006

Caching Class-related information...

Jacob is exploring caching information related to Java Class instances.

Say, for example, that we need the collection of getter methods for each of several classes. Rather than filter out the getter methods every time we need them, we can build each collection of getters once and cache the result in a static map:

  /** Maps a class to getter methods in that class. */
  static Map<Class<?>, Collection<Method>> getterCache = ...;

Problems arise when the garbage collector needs to reclaim a Class instance to which our map holds a strong reference. This could happen if our cache class is in the system class path and a class in our cache is from a web application.

When we go to reload the web application, our cache class keeps a strong reference to the web application class, which keeps a strong reference to its class loader, which keeps strong references to all of the other web application classes--memory leak.

If you can, store the cache somewhere where it will free up at the same time as the classes it caches. For example, in the case of a web application, we might store the cache in the servlet context.

If you must keep the cache in a static scope, keep only a WeakReference to the Class instance so that it stays eligible for garbage collection. Jacob pointed out that we need to keep a WeakReference to the cache key, but oftentimes your value also has a strong reference to the Class instance.

In our getter example, our value is a list of methods. Those methods each keep a strong reference to their class--memory leak. How do we fix this? We can't keep a weak reference to the value itself in this case: in between usages of our cached data, the cache holds the only reference, so the garbage collector could reclaim the value right away. We could wrap each Method instance in a weak reference, but I find keeping a SoftReference to the cached value works just fine. Plus, the garbage collector can reclaim values which we don't access often.
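As a rough illustration of the weak-key, soft-value idea using only the JDK, something like the following could work. SoftGetterCache, gettersOf and findGetters are names made up for this sketch, and the getter test is deliberately simplified. Note that because the softly referenced value still points back at the Class, the entry only goes away once the collector decides to clear the soft reference.

```java
import java.lang.ref.SoftReference;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper: weak keys via WeakHashMap, soft values via SoftReference. */
class SoftGetterCache {

  // WeakHashMap holds its keys weakly; synchronizedMap makes it safe to share.
  private static final Map<Class<?>, SoftReference<Collection<Method>>> CACHE =
      Collections.synchronizedMap(
          new WeakHashMap<Class<?>, SoftReference<Collection<Method>>>());

  static Collection<Method> gettersOf(Class<?> clazz) {
    SoftReference<Collection<Method>> ref = CACHE.get(clazz);
    Collection<Method> getters = (ref == null) ? null : ref.get();
    if (getters == null) {
      // Either never computed, or the collector cleared the soft reference.
      // Two threads may both compute here; the result is the same either way.
      getters = findGetters(clazz);
      CACHE.put(clazz, new SoftReference<Collection<Method>>(getters));
    }
    return getters;
  }

  private static Collection<Method> findGetters(Class<?> clazz) {
    Collection<Method> getters = new ArrayList<Method>();
    for (Method m : clazz.getMethods()) {
      String name = m.getName();
      // Simplified getter test, for illustration only.
      if (name.length() > 3
          && name.startsWith("get")
          && Character.isUpperCase(name.charAt(3))
          && m.getParameterTypes().length == 0
          && !Modifier.isStatic(m.getModifiers())) {
        getters.add(m);
      }
    }
    return getters;
  }
}
```

This is exactly the check, create and put boilerplate that has to be repeated for every cache written this way.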

How do we implement this cache? It just so happens I've created a class ReferenceCache which solves this exact problem. ReferenceCache wraps ConcurrentHashMap and references keys and values however you like, strongly, softly or weakly.

Compared to using a WeakHashMap, ReferenceCache abstracts more of the caching logic so you don't have to write code to check for a value, create a value, and put the value in the cache over and over. You only write code to create the value.

Where WeakHashMap uses equality to compare keys, ReferenceCache compares weakly and softly referenced objects by identity; weak and soft references are inherently identity based.
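To make the equality-versus-identity distinction concrete, here is a small sketch that uses the JDK's IdentityHashMap as a stand-in for identity comparison (it is not ReferenceCache itself). KeyComparisonDemo is a made-up name.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.WeakHashMap;

class KeyComparisonDemo {

  /** Returns { size of the equality-based map, size of the identity-based map }. */
  static int[] sizes() {
    String a = new String("key");
    String b = new String("key"); // equal to a, but a distinct object

    Map<String, Integer> byEquality = new WeakHashMap<String, Integer>();
    byEquality.put(a, 1);
    byEquality.put(b, 2); // a.equals(b), so this overwrites the first entry

    Map<String, Integer> byIdentity = new IdentityHashMap<String, Integer>();
    byIdentity.put(a, 1);
    byIdentity.put(b, 2); // a != b, so this is a second entry

    return new int[] { byEquality.size(), byIdentity.size() }; // {1, 2}
  }
}
```

For Class keys the difference rarely matters, since Class does not override equals, but it does matter for keys whose equals method compares contents.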

WeakHashMap cleans up after garbage collected keys in the same code path as your map operations. Thanks to the ConcurrentHashMap behind the scenes, ReferenceCache can perform these cleanups in a background thread.
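Cleanup in a background thread might look roughly like the sketch below, which uses a ReferenceQueue and a ConcurrentHashMap. This is not ReferenceCache's actual code; WeakIdentityMap and WeakKey are hypothetical names, values are held strongly, and the cleaner thread keeps the map itself alive for the life of the JVM, which a real implementation would want to avoid.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical map with weak, identity-compared keys and strong values. */
class WeakIdentityMap<K, V> {
  private final ConcurrentMap<WeakKey<K>, V> map =
      new ConcurrentHashMap<WeakKey<K>, V>();
  private final ReferenceQueue<K> queue = new ReferenceQueue<K>();

  WeakIdentityMap() {
    Thread cleaner = new Thread(new Runnable() {
      public void run() {
        try {
          while (true) {
            // Blocks until the collector enqueues a cleared key reference.
            Reference<? extends K> cleared = queue.remove();
            map.remove(cleared);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }, "weak-identity-map-cleaner");
    cleaner.setDaemon(true); // never keeps the JVM alive
    cleaner.start();
  }

  void put(K key, V value) {
    map.put(new WeakKey<K>(key, queue), value);
  }

  V get(K key) {
    // A throwaway key with no queue, used only for the lookup.
    return map.get(new WeakKey<K>(key, null));
  }

  int size() {
    return map.size();
  }

  private static final class WeakKey<T> extends WeakReference<T> {
    private final int hash; // captured up front; the referent may vanish

    WeakKey(T referent, ReferenceQueue<T> queue) {
      super(referent, queue);
      this.hash = System.identityHashCode(referent);
    }

    @Override public int hashCode() {
      return hash;
    }

    @Override public boolean equals(Object o) {
      if (o == this) return true; // lets the cleaner remove cleared keys
      if (!(o instanceof WeakKey)) return false;
      Object mine = get();
      return mine != null && mine == ((WeakKey<?>) o).get();
    }
  }
}
```

Because the map operations never touch the queue, readers and writers pay nothing for cleanup; the cost moves to the cleaner thread.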

What's more, ReferenceCache creates values canonically, i.e. it will only create one value per key at a time. This comes in handy pretty much any time you need a canonical map and prevents extraneous value creations other times.
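The canonical-creation idea on its own, ignoring weak and soft references, can be sketched with ConcurrentHashMap.computeIfAbsent, which arrived in Java 8, well after this post. CanonicalCache is a hypothetical name and this is not how ReferenceCache is implemented; it only illustrates "one value per key, and you write only the creation code".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache that creates at most one value per key. */
class CanonicalCache<K, V> {
  private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();
  private final Function<K, V> creator;

  CanonicalCache(Function<K, V> creator) {
    this.creator = creator;
  }

  V get(K key) {
    // ConcurrentHashMap applies the function at most once per absent key,
    // even when several threads ask for the same key at the same time.
    return map.computeIfAbsent(key, creator);
  }
}
```

Callers supply only the creation function; the check and the put are handled in one place.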

Getting back to our getter example, a ReferenceCache-based implementation goes something like this:

  static Map<Class<?>, Collection<Method>> getterCache =
      new ReferenceCache<Class<?>, Collection<Method>>(
          WEAK, SOFT) {
    /** Called by the cache when no value exists yet for the given class. */
    protected Collection<Method> create(Class<?> clazz) {
      Collection<Method> getters = new ArrayList<Method>();
      for (Method m : clazz.getMethods()) {
        if (isGetter(m)) {
          getters.add(m);
        }
      }
      return getters;
    }
  };

  static boolean isGetter(Method m) {
    String name = m.getName();
    return name.length() > 3
        && name.startsWith("get")
        && Character.isUpperCase(name.charAt(3))
        && m.getParameterTypes().length == 0
        && !Modifier.isStatic(m.getModifiers())
        && ...;
  }

Now, what do we do about Super Type Tokens? ;)

P.S. Any time you cache objects, run performance tests to determine whether your cache really does more good than harm.


Blogger Jacob Hookom said...

if you want to retain a strong reference to the cached getters, wouldn't this cause a memory leak since methods have a strong reference back to their declaring Class, mooting the weak referenced key? I guess this is why I think class information should be cached at a much higher level since we know that when we dive into Class meta information, attempting to weakly reference the Class instance will be negated by anything you store?

6:22 AM  
Blogger Bob said...

Yeah, from my experience, your values will almost always contain a reference to your class. That's why I said, "in our getter example, our value is a list of methods. Those methods each keep a strong reference to their class--memory leak."

Then I suggested you either cache the information closer to the classes you're caching (I assume this is what you mean by "higher level"), keep weak references to the Method instances, or keep a soft reference to the value (my preference).

9:10 AM  
Blogger Jacob Hookom said...

thanks for your insight on this, it'd be nice if the SE had something similar, less the lazy creation logic-- which is nice, btw

9:23 AM  
Blogger Jacob Hookom said...

sorry for the misunderstanding too with my original question, what I was shooting for was a way to retain strong references to meta information per classloader (outside of your soft reference solution), but it doesn't sound like that's possible.

9:29 AM  
Blogger Bob said...

No problem. I'm hoping to get this into Java 7, lazy creation logic and all. ;)

As for retaining strong references per ClassLoader, I wrote some code a while back which hacked a static collection class into each class loader. I wouldn't recommend that though.

I wish ClassLoader had a method like: reference(Object), where it would keep a strong reference to any object you passed it. You could pass the ClassLoader the getter collection in our example, a super type token, etc., and then keep a weak reference yourself.

9:37 AM  
Blogger Unknown said...

Or a ClassLoaderLocal...

4:04 PM  
Blogger Bob said...

Thanks for the link!

5:05 PM  
