
Caching WordPress Data with the Transients API

If you’re a plugin or theme developer, there may come a time when you need to execute a long-running operation. It doesn’t need to be anything complicated; something as simple as fetching a Twitter feed can take a significant amount of time. When you come across these types of situations, it’s handy to be able to store the data on your own server and then fetch a fresh copy of it every X hours. This is called caching, and WordPress conveniently comes packaged with an excellent caching API called Transients.

The Transients API is surprisingly simple to use. In fact, it’s very much like using update_option and get_option, except with an expiration time. If you aren’t familiar with caching at all, here’s the general workflow:

  1. If the data exists in the cache and isn’t expired, get it.
  2. If the data doesn’t exist in the cache or is expired, perform the necessary actions to get the data.
  3. Store the data in the cache if it doesn’t already exist or is expired.
  4. Continue from here using the data.

When attempting to use the Transients API for caching, there are three functions that you need to be aware of: set_transient, get_transient, and delete_transient.

  • set_transient($identifier, $data, $expiration_in_seconds): This function stores your data in the database. The identifier is a string that uniquely identifies your data. Your data can be any sort of complex object, so long as it is serializable. The expiration is how long you want your data to be valid, in seconds (ex: 12 hours would be 60*60*12).
  • get_transient($identifier): This retrieves your data. If the data doesn’t exist or the expiration time has passed, false is returned. Otherwise, the same data you stored will be returned.
  • delete_transient($identifier): This will delete your data before its expiration time. This is handy if you are storing post-dependent data, because you can hook it into the save action so that every time you save a post, your cached data is cleared (as sketched below).
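
For example, here’s a minimal sketch of that idea. The callback name is just a placeholder, and the transient key matches the example further down:

// Clear the cached data any time a post is saved so the next
// request rebuilds a fresh copy.
add_action('save_post', 'my_clear_cached_data');

function my_clear_cached_data($post_id) {
     delete_transient('super_expensive_operation_data');
}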

Now that we’ve covered the basics, how about a quick example?

// Try the cache first; get_transient() returns false on a miss or after expiration.
if (false === ( $my_data = get_transient('super_expensive_operation_data') ) ) {
     // Cache miss (or expired): rebuild the data and cache it for 12 hours.
     $my_data = do_stuff();
     set_transient('super_expensive_operation_data', $my_data, 60*60*12);
}

echo $my_data;

// Stand-in for an expensive operation (a slow loop in place of a real query or API call).
function do_stuff() {
     $x = 0;
     for ($i = 0; $i < 999999999; $i++) {
          $x += $i;
     }
     return $x;
}

The example is pretty straightforward. We first check to see if there is a cached copy of the data; if there isn’t, we generate the data with the do_stuff function and store it in the database. Simple, right?

One of the benefits of using the Transients API (aside from speeding your site up) is that plugins like WP Super Cache or W3 Total Cache will auto-magically cache your data in memcached if you have it set up. For you, this means an even faster site! If you have any questions about caching techniques or the Transients API, leave a comment and I’d be happy to help.

Creating a Flexible Caching Module

I work for a great company, I really do.  Sure we have our problems (like just coming out of the developer stone age), but overall I work with smart, friendly people who are passionate about what they do.  One of the things that always bugged me though was the lack of any sort of caching in our CMS.  Most times it’s not needed, but when an operation takes about 100 queries or so to finish, then it’s time to start caching.

Since I’m a bit of an efficiency freak, I thought I would take a crack at writing a flexible caching module that is easy for our developers to use.  So what do “easy” and “flexible” mean?  To be “easy”, the caching module must be usable by even a novice developer and have a limited number of options.  For instance, I ended up deciding that we really only need two public methods, and one public property.

  • $cache->exists – If “$cache” is a cache object, calling exists checks to see if the cached object already exists in the database.  It also checks to see if it’s expired or not.  If it’s expired or non-existent, it returns false.   It returns true if the cached object exists and is up to date.
  • $cache->put($val) – This is how you store something in cache.  It can be any type of serializable PHP object.  So basically, resource types are off limits but objects, arrays, variables, entire web pages, etc. can be used.
  • $cache->get() – This fetches the object stored in the cache.  It handles unserializing it as well, so it really makes things pretty idiot-proof (there’s a quick usage sketch right after this list).
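
The class itself isn’t shown here, but a typical call site might look something like this (the class name, key, and query function are made up for illustration):

// Hypothetical constructor: a cache key plus a lifetime in seconds.
$cache = new Cache('homepage_article_list', 60 * 15);

if ($cache->exists) {
     // A fresh cached copy is available, so skip the expensive work.
     $articles = $cache->get();
} else {
     // Run the heavy queries, then store the result for next time.
     $articles = run_expensive_article_query();
     $cache->put($articles);
}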

What about flexible?  Well, by that I mean we need to be able to transparently swap between several different types of caching.  Since we’re just crawling out of the dark ages, I opted to implement a fallback caching mechanism.  Here’s how it works (there’s a rough sketch of the selection logic after the list).

  • The programmer defines a variable in our settings area that specifies which caching option he/she wants to use.  Options are memcached, file, and mysql.
  • If the setting isn’t defined, we try memcached by default.  This is by far the best caching system to use, so it makes sense to try it first.
  • If memcached fails, we fall back to a database caching scheme.  While not nearly as good as using memcached, it could still save you tons of queries on your database.
  • If the user chooses file caching, we do that.  It’s a pretty bad idea in most cases, but it may still have its uses.
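
Here’s a rough sketch of what that selection logic might look like; the setting name, availability check, and backend classes are invented for illustration:

// Pick a backend based on the configured setting, defaulting to memcached.
$driver = defined('CACHE_DRIVER') ? CACHE_DRIVER : 'memcached';

if ($driver == 'memcached' && !memcached_is_available()) {
     // Memcached was requested (or defaulted to) but isn't reachable,
     // so fall back to the database-backed cache.
     $driver = 'mysql';
}

switch ($driver) {
     case 'memcached': $backend = new MemcachedCacheBackend(); break;
     case 'file':      $backend = new FileCacheBackend();      break;
     default:          $backend = new MysqlCacheBackend();     break;
}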

So why did it come to this?  It’s not that we host terribly high-volume sites, but that our CMS is super slow.  A full page will take about 4 seconds to load to your screen completely, and that’s running locally on the network.  One of the main problems is that we use output buffering extensively.  The ENTIRE FRIGGIN PAGE is buffered.  This has 3 side effects:

  1. Slight performance loss due to buffering.
  2. Apparent page load time sucks because the browser has to wait for the entire page to be generated before getting output.
  3. Development is super easy because you don’t ever have to worry about output being sent before doing a call like “header()” (a quick illustration follows this list).
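
For instance, something like this works fine under buffering (the header name is made up):

ob_start();                      // from here on, echoed output is buffered instead of sent

echo 'Some page content...';

// Without output buffering this header() call would trigger a
// "headers already sent" warning, because output was produced above.
// With the buffer in place, nothing has actually gone to the browser yet.
header('X-Example: hello');

ob_end_flush();                  // send the headers and the buffered output together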

We can’t remove output buffering, unfortunately.  It’s at the very core of our CMS and development practices, so it just won’t work.  To get page generation time as low as possible, I decided that caching was needed.

So what sort of problems do we run into with this caching module?  Glad you asked!  Many of the problems aren’t specific to this caching module, but to caching in general.  The quick list:

  • If the original query wasn’t complicated, it’s not worth storing the results.  The caching module itself runs 3 queries in MySQL mode, so if your initial operation takes fewer queries than that, or isn’t a super-complex-mega-join, it’s not worth using.  This caveat goes away in memcached mode.
  • Smart naming and design.  You have to be very careful about what you cache, and when.  Remember, page content and queries probably change when a user is logged in or on a different device.  Just things to keep in mind.
  • Getting developers to use it.  Not everyone likes to learn, let alone change their habits.  The biggest barrier to this is getting people to use it.  Some people don’t care about efficiency either (sad, I know), but at least our system administrator thanks me.

The caching module seems to work pretty well too.  On one particularly SQL-heavy page, I reduced page load time from 14 seconds (ridiculous) to just under 6 seconds (still bad, but getting better).

That’s it for now.  Any questions or comments are welcome.