Author: Jason Zhang (Fengming Zhang)
Java Cache can dramatically improve the performance of Java/J2EE/Portal applications and achieve high end user satisfaction. In this paper, we discuss and analyze the key elements of an enterprise level Java cache framework. And finally we propose a way to evaluate Java cache frameworks when consider including them into application architecture design.
Since a web application is accessed by many concurrent users so how to increase the performance is always a challenge for web application architects. In a multi-tiered application, data access is an expensive operation compared to other tasks. Object caching is a technology that can keep frequently accessed data in the memory to avoid the cost and time required for the data’s frequent reacquisition and release. To share same frequently accessed objects so that server heap is utilized efficiently rather than creating similar objects for every request and hogging the heap space. That also means caching frequently accessed objects saves expensive CPU time that would otherwise have been spent on re-creating objects again and again for every request, and reduces the very time consuming frequent connections to database and other data sources.
But object caching does not mean that you cache all frequently accessed data. You have to consider about the available heap size for your application that it would require for its normal business operations that the application has to perform. Just because you want to reduce network call does not mean that you would cache all frequently accessed data. Also it makes no sense to cache rarely accessed objects and take precious memory away from JVM. Your cache should contain only a manageable number of frequently used objects that probably improve the performance of your application unless memory space is not a constraint (but normally it is not true).
Why we need cache framework instead of directly using Java Hashmap, Hashtable, JNDI (Java naming and directory interface) and even EJB (Enterprise Javabeans) etc. These methods provide a way to keep the Java objects in memory and perform the object lookup based on a key. But none of these methods provide any mechanism for either removing the object from memory when it is no longer needed or automatically creating the object when it is accessed after expiration. The HTTPsession object can also allow objects to be cached, but lacks the concept of sharing.
In 2001, A JCP (Java Community Process) specification for Java Cache-JSR-107 (JCACHE – Java Temporary Caching API) was released. This specification specifies API and semantics for temporary, in memory caching of Java objects, including object creation, shared access, spooling, invalidation, and consistency across JVMs. Though JSR-107 was updated again in 2005, since it is still under draft version, not so many existing frameworks are seriously complaint with it  .
2. DATA TO BE CACHED
Typical uses of caching include: (1) HTML pages; (2) Database query results; (3) Any information that can be kept in the form of Java object: from database, JMS, JNDI, EJB, RMI, Web Services and other channels. In this paper, we are mainly focused on the third type of data format, that is, any data from any data source that are kept in the form of Java object.
For Java object itself, we still need to define what kind of data should be kept in the cache. A server must manage information and executable objects that fall into three basic categories: objects that never changes, objects that are different with every request, and everything in between. Java is well equipped to handle the first two cases but offers little help for the third. If the object never changes, we create a static object when the server is initialized, for example use Hashmap to keep the key and value. If the object is unique to every request, we create a new object each time. For everything in between, objects or information that are shared across requests, between users or between processes, we can consider to use a cache layer to keep them .
3. JAVA CACHE KEY ELEMENTS
Whether we need to design a new Java cache component or select a Java cache framework from a long list of open source or commercial products, we need to be very clear about the factors that will help us make our final decision. In addition, we also need to understand that different Java/J2EE projects have different features and one Java cache framework working well for one project does not mean that it is also working fine for another one. Even for similar projects, the same Java cache framework should also be configured or even customized to meet thoese special requirements. So understanding the Java cache inner architecture and its key elements would benefit architecture design.
3.1 Objectives of Java Caching Framework
Below are a general summary of the basic considerations when choosing a Java cache framework:
- Faster access to cached data.
- Grouping of similar object types in the cache. The framework should invalidate a collection of objects with a single operation. It should be possible to associate objects in the cache so they can be managed as a group.
- Object references should be also cached.
- Configurable cache management so we can modify cache parameters declaratively rather than programmatically.
- Easy integration into any web application with minimal or no changes in the Web application itself.
- Support multiple expiration mechanism (or eviction policy). The application can run out of memory due to not releasing the unused data from the cache at regular intervals.
- A flexible and extensible framework so we can switch to any third-party caching API in the future.
- Flexibility to configure one or more data storage options, such as memory cache, disk cache, or caching the data on a remote machine.
- Ability to manage objects loaded from any source. The original source of the data being cached should have no restrictions.
- Data Synchronization issues and data inconsistency problem can arise because of inconsistency between stale cached data and original source of data. If original data is updated, cached data should be intimated about the change and update the cached data. This feature of intimating and updating the cached data comes with a price of putting framework in between the original data and the cached data. And the price is high if it is on a distributed environment.
- In an enterprise environment, clustering and data replication feature is necessary. The cache framework should be able to seamless integrated with clustering environment without special code change or customization. The replication between clustering node should be real-time and accurate.
- It should be transaction safe. The cache framework should support the general locking mechanism to ensure data consistency.
From above list, we can see Java cache framework involves a lot of technologies, and it becomes much more complex if it needs to meet enterprise level requirement. Since it is bit hard for us to cover them all in this paper, for the rest of this section, we will discuss a few key features that always need to be especially considered when Java cache framework is expected to be adopted in a Java/J2EE or Java Portal Application.
- Object Caching Data Structure
- Object Eviction Algorithms
- Cache Loader
- Clustering and Distributed Cache
3.2 Java Object Caching Data Structure
3.2.1 Cache and HashMap
Many would say that Maps are a starting point of a cache (an argument used by the JSR-107 JCACHE expert group to make Javax.cache.Cache extend Map, in fact). But while maps are great for storing simple key/value pairs, they fall short on a lot of other features that may be necessary in a cache, such as memory management (eviction), passivation and persistence, and a finer grained locking model (a HashMap, for one thing, is not thread-safe at all. And current HashMaps use a coarse level of locking that will not allow non-blocking readers or even multiple readers). So, while a Map is a good starting point, if anyone feels they need to implement or manage any of the above features themselves then they probably should be using a cache and not a Map.
Different cache frameworks have different data structure; below I will take JBoss as an example to introduce its data structure, which can represent the typical design principles of the most of the existing open source frameworks.
3.2.2 JBoss Cache Data Structure
JBoss has two types of cache frameworks: Plain cache (or core cache) and Pojo Cache.
Plain Cache stores object references directly, its function just like a Hashmap API. Its limitation is: (1) each object must be serialized; (2) Coarse grained: replicate a big object even only a sub-field is dirty it still requires the user to trigger the replication.
POJO Cache (Plain Old Java Object) is fine-grained and update is per field basis but can have batch update as well. Support full object graph relationship during replication and persistency (circular, multiple references and one object multiple keys).
1. Plain Cache mechanism The Plain Cache simply stores whatever you give it, in a tree-like data structure. Key/value pairs are stored in tree nodes, and are serialized for replication or persistence.Plain Cache consists of a collection of node instances, organized in a tree structure. Each node contains a Map which holds the data objects to be cached. Each node has one and only one parent, and the root node is denoted by the constant fully qualified name, Fqn.ROOT.
Figure 1 Data structured as a tree
In the diagram above, each box represents a JVM. You see two caches in separate JVMs, replicating data to each other. Any modifications in one cache instance will be replicated to the other cache. Naturally, you can have more than two caches in a cluster.
2. POJO Cache Mechanism
POJO Cache is built on top of Plain Cache but uses a more sophisticated mechanism of introspecting user classes by bytecode weaving, adding listeners to fields such that the cache is notified when fields are modified. For example, storing a large, complex object in POJO Cache will result in POJO Cache introspecting the object bytecode, and only storing the object’s field primitives in the tree structure. And whenever the object’s fields are changed, only those fields are replicated, allowing us to achieve very efficient fine-grained replication.
POJO Cache internally uses the JBoss Aop framework to both intercept object field access, and to provide an internal interceptor stack for centralizing common behavior (e.g. locking, transactions).
The following figure is a simple overview of the POJO Cache architecture. From the top, it can be seen that when a call comes in (e.g., attach or detach), it will go through the POJO Cache interceptor stack first. After that, it will store the object’s fields into the underlying Plain Cache, which will be replicated (if enabled) using JGroups.
Figure 2 POJO Cache architecture overview
3. Object Relationship Management
As JBoss cache allows for any type of object relationship available in the Java language to be transparently handled. During the mapping process, all object references are checked to see if they are already stored in the cache. If already stored, instead of duplicating the data, a reference to the original object is written in the cache. All referenced objects are reference counted, so they will be removed once they are no longer referenced.
To look at one example, let’s say that multiple Persons (“Mary” and “Peter”) objects can own the same office address. The following diagram is a graphical representation of the physical cache data. As can be seen, their address is only stored once.
Figure 3. Object Relationship Management
3.3 Cache Eviction Algorithms
Except the cache data structure, probably the most key issue for a cache is selection of the algorithm used to decide which data to retain in the cache. To decide which algorithm is best for your purposes, it’s good to know your data access patterns. A variety of cache expiration mechanisms have been used for object caching purpose. These algorithms are based on criteria such as least frequently used (LFU), least recently used (LRU), most recently used (MRU), first in first out (FIFO), last access time, and object size etc. Each algorithm has advantages and disadvantages. LFU and LRU are simple, but they don’t consider the object size. A size-based algorithm removes big objects (that require much memory), but the byte-hit rate will be low. It’s important to consider all the web application’s data access patterns before deciding which cache algorithm to be used for expiring cached objects .
FIFO (First In First Out): Items are added to the cache as they are accessed, putting them in a queue or buffer and not changing their location in the buffer; when the cache is full, items are ejected in the order they were added. Cache access overhead is constant time regardless of the size of the cache. The advantage of this algorithm is that it’s simple and fast; it can be implemented using just an array and an index. The disadvantage is that it’s not very smart; it doesn’t make any effort to keep more commonly used items in cache.
Summary for FIFO: fast, not adaptive, not scan resistant.
LRU (Least Recently Used): Items are added to the cache as they are accessed; when the cache is full, the least recently used item is ejected. Cache access overhead is against constant time. This algorithm is simple and fast, and it has a significant advantage over FIFO in being able to adapt somewhat to the data access pattern; frequently used items are less likely to be ejected from the cache. The main disadvantage is that it can still get filled up with items that are unlikely to be reaccessed soon; in particular, it can become useless in the face of scans over a larger number of items than fit in the cache. Nonetheless, this is by far the most frequently used caching algorithm .
Summary for LRU: fast, adaptive, not scan resistant
LRU2 (Least Recently Used Twice): Items are added to the main cache the second time they are accessed. Because of the need to track the two most recent accesses, access overhead increases logarithmically with cache size, which can be a disadvantage. The advantage is that it adapts to changing data patterns, like LRU, and in addition won’t fill up from scanning accesses, since items aren’t retained in the main cache unless they’ve been accessed more than once.
Summary for LRU2: not especially fast, adaptive, scan resistant
2Q (Two Queues): Items are added to an LRU cache as they are accessed. If accessed again, they are moved to a second, larger, LRU cache. Items are typically ejected so as to keep the first cache at about 1/3 the size of the second. This algorithm attempts to provide the advantages of LRU2 while keeping cache access overhead constant, rather than having it increase with cache size. Published data seems to indicate that it largely succeeds.
Summary for 2Q: fairly fast, adaptive, scan resistant
LFU (Least Frequently Used): Frequency of use data is kept on all items. The most frequently used items are kept in the cache. The advantage is that long term usage patterns are captured well, incidentally making the algorithm scan resistant as well; the disadvantage, besides the larger access overhead, is that the algorithm doesn’t adapt quickly to changing usage patterns, and in particular doesn’t help with temporally clustered accesses.
Summary for LFU: not fast, captures frequency of use, scan resistant
Simple time-based expiration: Data in the cache is invalidated based on absolute time periods. Items are added to the cache, and remains in the cache for a specific amount of time.
Summary for Simple time-based expiration: Fast, not adaptive, not scan resistant.
Extended time-based expiration: Data in the cache is invalidated based on relative time periods. Items are added to the cache, and remains in the cache until they are invalidated at certain points in time, such as every five minutes, each day at 12:00AM etc.
Summary for Extended time-based expiration: Fast, not adaptive, not scan resistant.
TTL (Time to live) expiration: Data in the cache is invalidated by specifying the amount of time the item is allowed to be idle in the cache after last access time .
Summary for sliding time-based expiration: Fast, adaptive, not scan resistant.
Working set: The cache is periodically checked, recently access members are considered part of the “working set”. Members not in the working set are candidates for removal. Size of cache is not defined directly; rather the frequency of the periodic checks indirectly controls how many items are deleted.
Summary for Working Set: Fast, adaptive, theoretically near optimal, not scan resistant.
Other algorithms: there are other caching algorithms available that have been tested in published papers. Some of the popular ones include CLOCK, GCLOCK (Generalized CLOCK), and LRD (Least Reference Density). Of possible interest is IBM’s ARC (Adaptive Replacement Cache) which includes some useful tables giving overhead times and hit ratios as functions of cache size and some other parameters .
3.4 Cache Loader
In conjunction with eviction policies, a cache loader allows a user to maintain a bounded cache for a large backend datastore. Frequently used data is fetched from the datastore into the cache, and the least used data is evicted, in order to provide fast access to frequently accessed data. With a cache loader, the Java object cache automatically determines if an object needs to be loaded into the cache when the object is requested.
Two kinds of locking are normally adopted for Java cache framework: pessimistic locking and optimistically locking.
Pessimistic locking is an approach where an entity is locked for the entire time during a transaction. A lock either limits or prevents other users from working with the entity. A write lock indicates that the holder of the lock intends to update the entity and disallows anyone from reading, updating, or deleting the entity. A read lock indicates that the holder of the lock does not want the entity to change while the hold the lock, allowing others to read the entity but not update or delete it. Normally read lock is applied in Java cache framework.
Optimistic lock, which involved versioning data and maintaining copies for each transaction, validating copies upon transaction commit with the cached entity. This approach led to a very highly concurrent setup for a read-heavy system where readers are never blocked by concurrent writers, and also overcame the potential for deadlocks which may occur in pessimistically locked cache framework.
3.6 Clustering and Distributed Cache
Clustering is a very important feature of enterprise grade Java cache. In a clustering environment, keeping all the cluster members’ cached data in sync is crucial.
Distributed cache takes clustered caches a step further. They distribute any cached state across a cluster to maximize retrieval efficiency, reduce overall memory used, and guarantee data redundancy. This means data could be fragmented and fragments stored on different cache instances. There is no requirement that a piece of data is resident on every cache instance in the cluster.
The advantage of distributed cache is that cache itself can be kept on different JVM from the application server; this can save much memory space. The disadvantage of the distributed cache is that the network communication is always the performance bottleneck.
4. JAVA CACHE EVALUATION
4.1 Choose Proper Java Cache Framework
Normally, for an internal small-medium size Java application with limit concurrent users, we don’t suggest to consider the cache layer, because the performance might be good enough without cache. But if this small-medium size application needs to take a few hours to generate a monthly report, and this report would be accessed frequently for the next 6 months, then the cache framework would become necessary.
If many concurrent users are expected for an application, then cache must be a must component to be included in the architecture design. If it is a public web application/portal, cache selection will be more critical.
We need to understand the properties of the project first like scale, data access pattern and performance requirement etc. before we decide what and how a Java cache framework should be used. Here we can also take JBoss cache as an example. As we have discussed before, JBoss cache has two kinds of cache: Plain cache and POJO cache. We cannot simply say that POJO is better than plain object. The issue of fine-grained replication must have a significant impact on performance. Fine-grained replication can really help if you have large and complex objects in your cache. But if you just use it to store strings, it is of little value. Similarly, for simple custom objects – for example a Person class that has just two String fields, POJO Cache would be more of an overhead (field interception, etc.) than a benefit.
However we also need to consider the disadvantages of Java cache. Object caching also includes a few disadvantages, such as memory size, for example. The cache may consume significant heap space in the application server. JVM memory size can become unacceptably huge if a lot of unused data is in the cache and not released from memory at regular intervals.
Another disadvantage is synchronization complexity. Depending on the kind of data, complexity increases because consistency between the cached data’s state and the data source’s original data must be ensured. Otherwise, the cached data can fall out of sync with the actual data, which leads to data inaccuracies.
4.2 Open Source Java Cache Frameworks
There are many open source cache frameworks available and some of them are powerful enough to compete with commercial Java cache products in the market. These open sources have some common features like in memory key value cache, support multiple eviction algorithms, high speed read and low speed put, clustering and distributed etc. Different caches implement these features in different ways and more and more frameworks are having clustering and distributed features. Except above mentioned JBoss Cache, JCS (Java Caching System) and Memcached are also powerful and they also have clustering and distributed features. JCache Open Source is an effort to make an open source version of JSR-107. Since the JSR-107 hasn’t been released any specs for years, this version still builds on the original Functional Specification. EHCache is a pure Java, in-process cache acting as a pluggable cache for Hibernate. There are also some other good open source Java caching frameworks like Cache4j, Open Terracotta, SwarmCache, WhirlyCache, OSCache, ShiftOne and SHOP.COM Cache etc . How to judge which one is the best fit for your project will be a challenge work. In the next section, we will discuss the evaluation methods to select your Java caching framework.
4.3 Java Cache Evaluation
In front of the long list of open source and commercial Java cache frameworks, you have to decide which one is the best fit for your project. As we mentioned here before, you need to consider characters of both the project and the Java cache frameworks.
Please check below table regarding to the mapping between the project requirements and Java caching considerations.
Table 1. Requirement and Caching Mapping
||Java Caching Considerations
|Data access pattern
||Cache data structure, Eviction Algorithms, Transaction and Locking
||Flexible configuration and tunable
|Concurrent user numbers
|Clustering and scalability
||Data replication, Clustering, Distributed cache,
||Eviction Algorithms, Passviation, PersistenceDistributeed cache
||Flexibility of configuration, Extendable framework
Java cache always will come to your picture when you start system analysis or architect design. As we have discussed before, not all the projects need a cache layer especially those small sized internal project. Caching should be considered as a component in your architecture if the system has objects that can be shared across requests and users.
If you decide to involve caching framework into your architecture, we propose you can consider evaluating a caching framework from following aspects.
Cache access overhead time: Here we propose a cache overhead evaluation index formula:
Ic = Cj / Dj
Ic indicts the cache overhead evaluation index;
Cj indicts the cache access overhead time;
Dj indicts the time spent on querying data directly from the data source.
Based on the Ic value, we can judge our cache strategies:
When Ic > 1, it means that it will take more time to query data from the cache then directly querying it from datasource; So cache might not be necessary for this application;
When Ic =1, cache is optional for your application.
When Ic < 1, it means cache is a good choice for your architecture. If Ic << 1, then cache will be a must.
In a distributed cache environment, you need also to calculate the time spent on querying data from a distributed cache. It might be possible that we have Ic <1 for local JVM, but have Ic>1 for the cache from a distributed JVM, in this case, distributed cache is not a good choice for your architecture. Distribute cache can be selected unless Ic is always less than 1.
Performance: high performance is always required by end users, so the Java cache is always expected. However when we evaluate a caching framework, we should also consider the performance of the caching itself, so selecting a mature Java cache framework is always the first principle.
Data access pattern: (1) As we have discussed in section 2, only the third Java object form can be cached; (2) if data access has a time based feature like typical stock or forex data, then time based eviction algorithms should be considered; (3) If the data access has a hit-rat feature like online store, the popular goods should always be cached, so the frequency based eviction algorithms should be considered like LFU; (4) If the memory has some limit then the size based algorithms should be preferred; (5) If data access has a sequence feature then FIFO is a better choice; (6) If the application has some different data access patterns, then an evaluation needs to be done to see which data access pattern affects the performance the most. Normally “Working Set” algorithm can be considered in a mixed data access pattern environment. So normally, selecting a cache framework supporting multiple eviction algorithms would be very important and flexible.
Project end-user patterns: for an internal application, it would be easier to get to know user data access patterns, and we can easily identify the features of working data set. But for a public web application, end-users’ behaviors are normally unexpected, in this case, a configurable and tunable caching framework would be more fit.
Concurrent user numbers: as concurrent user number increases, normally the performance will be slow down because a large amount of heap will be occupied by so many sessions. Sessions and caching are competing the memory space and it is possible that out of memory can be occurred if no space available for any of them. In this case, we need to balance both system stability and the performance. The algorithm parameters need to be tuned during performance testing period .
Clustering and scalability: we should always select a cache framework supporting clustering in case higher performance is required or customer wants the application grow larger and larger.
Memory size: the memory assigned to JVM is always limited, so we need to calculate/evaluate how many should be allocated for caching data. Based on the memory calculated, different eviction algorithm can be evaluated.
Requirement change: considering the requirement is always changing, so the flexibility to configure the cache framework parameters like eviction algorithms, passvition, cache loader etc would be very critical. It would be better the cache framework has an extendable framework; this can help developers to include their own algorithms if necessary.
In order to improve server performance and provide high quality user satisfaction, it is essential to include the caching in the architecture design. In this paper, we have discussed the key elements of enterprise Java caching framework as well as some popular open source cache components, and finally we propose a way to evaluate a Java caching framework from the perspective of mapping the need of projects requirement.
 Java Temporary Cache API (JSR107 – http://www.jcp.org).
 Jerry Bortvedt. Functional Specification for Object Caching Service for Java (OCS4J) (2.0), 3.
 Srini Penchikala. Implementing Object Caching with AOP, September 2004, http://www.theserverside.com/tt/articles/article.tss?l=ObjectCachingWithAOP.
 Greg Nudelman. The timestamp-based caching framework: current data with peak performance. Java World, January 2005.
 Nimrod Megiddo, Dharmendra S. Modha. ARC: A Self-Tuning, Low Overhead Replacement Cache, Conference On File And Storage Technologies, San Francisco, CA, 2003.
 Iliyan Nenov, Panayot Dobrikov. MEMORY SENSITIVE CACHING IN JAVA. International Conference on Computer Systems and Technologies, CompSysTech, 2005.
 Skill Level, Build faster Web applications with caching Cache frequently viewed data with Java Caching System, IBM Developer Works, Dec 2008.
 Open Source Cache Solutions, http://Java-source.net/open-source/cache-solutions.
Caching Strategies, http://faq.Javaranch.com/Java/CachingStrategies
7. ABOUT AUTHOR
Jason Zhang @ LinkedIn: http://cn.linkedin.com/pub/jason-zhang/9/436/a63