Saturday, March 31, 2012

Introduction to WCS Cache Invalidation - Part 1

"There are only two hard problems in Computer Science: cache invalidation and naming things."
-- Phil Karlton

As the quote above suggests, cache invalidation can be a complex topic.
In this post I will discuss the basics of cache invalidation, along with some guidelines for the different invalidation strategies that can be used with WebSphere Commerce.
This guide assumes some familiarity with configuring caching in WebSphere.

What is cache invalidation?
Essentially, it is the mechanism for removing stale items from the cache.
WebSphere Commerce relies on DynaCache, the dynamic cache service of WebSphere Application Server.

Why is invalidation important?
Invalidation protects the accuracy of the data served from the cache.
Most business applications have a consistency requirement for certain types of data.
For example, if my inventory reaches zero, I don't want the customer to purchase items which have no inventory and which cannot be back-ordered.

How can I invalidate cache?
Here is a list of the different ways you can invalidate cache:
  1. Manual Invalidation
  2. Time to Live (TTL)
  3. Dependency IDs
    1. Using command invalidation
    2. Using CACHEIVL (only in WebSphere Commerce)
    3. Using custom code

Manual Invalidation
This type of invalidation is done by using the WebSphere Commerce Cache Monitor.
The cache monitor is a web application which uses the DynaCache API to remove specific entries.

The Cache Monitor EAR file ships with WebSphere Application Server, but it is not installed by default.
Make sure servlet caching is enabled in your WCS environment before installing the Cache Monitor.

The advantage of this method is that you can invalidate all entries at the same time by pushing a button.
You can also invalidate only a given group of entries. This is a more complex topic which will be explained in a later post.

The disadvantage is that if you have a large number of entries, it is difficult to find a specific one.
There is no search capability and the navigation is not very user-friendly.

Invalidation using a TTL
The TTL value is configured in cachespec.xml.
A <timeout> element is added to the <cache-id> of a <cache-entry> element. The timeout is specified in seconds.
This type of invalidation is best used when you don't have a hard consistency/freshness requirement.

Let's use the caching of the product display page as an example.
For this example we will assume the following:
  1. The product table is updated once a day
  2. The update is done by a scheduled job
  3. The scheduled job runs once a day at 10pm
In this case it would make sense to set the TTL for ProductDisplay to 86400 (seconds in a day):
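
A minimal sketch of what such an entry might look like in cachespec.xml. The servlet class name, the /ProductDisplay path and the storeId and productId parameters are assumptions for illustration and may differ in your store; only the 86400-second timeout comes from the example above.

```xml
<cache>
  <cache-entry>
    <class>servlet</class>
    <!-- Assumed servlet name; adjust to match your store's configuration -->
    <name>com.ibm.commerce.struts.ECActionServlet.class</name>
    <cache-id>
      <!-- Cache only requests whose path is /ProductDisplay (assumed) -->
      <component id="" type="pathinfo">
        <required>true</required>
        <value>/ProductDisplay</value>
      </component>
      <!-- One cache entry per store and product (assumed parameters) -->
      <component id="storeId" type="parameter">
        <required>true</required>
      </component>
      <component id="productId" type="parameter">
        <required>true</required>
      </component>
      <!-- TTL in seconds: 24 * 60 * 60 = 86400 -->
      <timeout>86400</timeout>
    </cache-id>
  </cache-entry>
</cache>
```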



The timeout starts from the time each cache entry is created. Each product has a separate cache entry which is created when a page for a specific product id or part number is accessed. Since each cache entry could be created at a different time, they could also expire at a different time.

The advantage of this method is that it is easier to configure than dependency ID invalidation.

The main disadvantage of using TTL invalidation is that there is no guarantee that a cache entry will expire only after the product table has been updated. An entry can be rebuilt from data that is about to change.

Here is a scenario where the cache would not be properly invalidated:
  1. Most cache entries expire around midnight.
  2. The scheduled job is delayed due to some problem.
  3. A user accesses the product page for part number ABC at 12:30am, creating a new cache entry from the old data.
  4. The scheduled job updates the data for ABC and finishes running at 1am.
  5. The product page for ABC keeps showing the old data because its 24-hour timeout restarted at 12:30am, so the stale entry will not expire until about 12:30am the next day.

Ideally the product update job should always run before the cache expires, but in this case it didn't.
The easiest solution in this case is to do a manual cache invalidation at 1am after the scheduled job has finished running.

Another disadvantage is that entries created around the same time also expire around the same time. When that happens, a large number of requests hit WebSphere Commerce at once without the benefit of a cached response, which degrades the performance perceived by the user.

In conclusion, this method is best used when:
  1. You have a low consistency business requirement: the cache can be updated some time after the database is updated.
  2. You have the operational resources: someone is watching the scheduled job to see if it is delayed, and if something goes wrong they know to manually invalidate the cache.

NOTE: The <timeout> value is separate and different from the <inactivity> value. It is possible to have both settings. The <inactivity> value expires a cache entry after it has not been accessed for a given number of seconds, while <timeout> expires it a fixed number of seconds after it was created.
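
As a rough illustration of combining the two settings, the fragment below sketches a <cache-id> that uses both. The one-hour inactivity value and the component definitions are hypothetical and chosen only for the example.

```xml
<cache-id>
  <component id="" type="pathinfo">
    <required>true</required>
    <value>/ProductDisplay</value>
  </component>
  <component id="productId" type="parameter">
    <required>true</required>
  </component>
  <!-- Hard limit: expire 24 hours after the entry is created -->
  <timeout>86400</timeout>
  <!-- Illustrative value: also expire if not accessed for 1 hour -->
  <inactivity>3600</inactivity>
</cache-id>
```

With settings like these, a rarely visited product page would be dropped after an hour without requests, while a popular page would still be rebuilt at least once a day.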

Stay tuned for the next post which will cover dependency ID based cache invalidation ...

3 comments:

  1. How do you invalidate the cache using CACHEIVL?

  2. Hi, I found these topics in infocenter while reading DynaCache. Hope they will answer your question on how to use CACHEIVL to invalidate cache

    http://publib.boulder.ibm.com/infocenter/wchelp/v7r0m0/index.jsp?topic=/com.ibm.commerce.developer.doc/refs/rdcinvalid.htm

    http://publib.boulder.ibm.com/infocenter/wchelp/v6r0m0/index.jsp?topic=/com.ibm.commerce.admin.doc/tutorial/tdcperf3c.htm

  3. The out-of-the-box (OOB) cache invalidation scheduler job runs in WCS and invalidates the cache. You only need to enter the correct dataId in the CACHEIVL table.
