Handling the Digg Effect with WordPress Caching

Last week, I was very proud to have my Subversion Quick Reference guide hit the Digg front page, which was a first for me! I’ve been on the del.icio.us front page a number of times, but nothing compares to being dugg.

Even though my article hit the front page during a non-peak time, I still received over 3,000 hits within a fairly short period. Sure, I didn’t get the full Digg effect, but I was still surprised that my little WordPress blog hosted on Dreamhost handled the traffic with no problems whatsoever.

I have since concluded that my blog held up to the traffic because I use the WordPress caching module very aggressively. Here’s the list of things I’ve done that helped keep my blog afloat.

1) Install the WordPress Caching module if it’s not already installed.

If you haven’t already got the WordPress Caching module installed, get it and install it now. Note: Dreamhost installs it by default, so if you are a Dreamhost customer, all you have to do is turn it on.

If you don’t have the module, you can follow the instructions on the WP-Cache 2.0 page to install it. Once you’ve installed the plugin, click the activate button.

2) Configure the Cache Timeout to infinite

Unless you have content that requires constant updates, there’s no good reason not to set the cache expiration to infinite: the caching module is smart enough to remove the cached copy of a post when you update it, and it also removes the cached version of the index page.

Also, the WP-Cache module provides a syntax for executing dynamic code on otherwise static pages. You can call a function from a cached page and have it still execute dynamically, or you can include a PHP file that will always execute. The syntax for including a file works something like this:

<!--mclude ads.php-->
<?php include_once(ABSPATH . 'ads.php'); ?>
<!--/mclude-->

With that background information, we can move on to configuring the timeout setting. This setting determines how long pages stay cached before WordPress regenerates them. The issue from my standpoint is that during peak loads, if a user clicks on another link on your site, WordPress may have to regenerate that page if its timeout period has expired. I hate it when I load up a page, click on another link on the same site, and it either takes forever to load or fails entirely.

In the administration panel, browse over to Options, and then to WP-Cache. I typically just type a whole bunch of 9s into the expiration field, because it’s simpler than remembering some number.

Click the Change Expiration button, and we’re good to go. We have now cached our blog to the point where it’s nearly static… except it isn’t.

3) Delete and reload the cache when you change things.

Clearly we don’t want to revert to a completely static blog; we’ve grown up since the days of Geocities accounts. Do you remember which Geocities server you were on? I think I was on the siliconvalley one, but I can’t remember.

There are a few options for reloading the cache once something is actually updated. First, if we are making changes to the structure of the page (adding new links to the sidebar, moving elements around), then we will want to reload the entire cache. There’s an easy way to do so in the WP-Cache options panel: just click the Delete Cache button.

Alternatively, there’s another way to do this in a more automated fashion. I prefer to set a certain time for reloading the cache. It can be nightly, hourly, or whatever time period you would like to run it.

WP-Cache stores its files in the directory blogdirectory/wp-content/cache/. If we take a peek in there, we should see something like this:

wp-cache-685c5b56dc5181ffc5f2fe4753f9f25d.html wp-cache-e14d8b7d4c4a23bcdf808c8a5e41ec9e.html
wp-cache-685c5b56dc5181ffc5f2fe4753f9f25d.meta wp-cache-e14d8b7d4c4a23bcdf808c8a5e41ec9e.meta
wp-cache-692dd0426c4eda24b4d68f367fb100fb.html wp-cache-e4ca2bec97c1638bc9ccd6cffea0722d.html
wp-cache-692dd0426c4eda24b4d68f367fb100fb.meta wp-cache-e4ca2bec97c1638bc9ccd6cffea0722d.meta
wp-cache-6a318f2b596c174d8d45833c408d443b.html wp-cache-e9a3a0001759927ce99febae0e7df1ec.html

All we need to do to delete the cache is set up a cron job that runs rm wp-cache* in that directory. I’ll leave that exercise up to you.
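For anyone who wants a starting point for that exercise, here is a minimal sketch of a script a cron job could call. The function name clear_wp_cache and the idea of passing the cache directory as an argument are my own assumptions for illustration, not anything WP-Cache ships with. It only touches files whose names start with wp-cache-, so anything else in the directory is left alone.

```shell
#!/bin/sh
# Sketch: remove WP-Cache's cached pages from a cache directory.
# The directory passed in is an assumption; point it at your own
# blogdirectory/wp-content/cache path.

clear_wp_cache() {
    cache_dir="$1"
    # Refuse to run unless the argument is an existing directory.
    [ -d "$cache_dir" ] || { echo "no such directory: $cache_dir" >&2; return 1; }
    # Delete only regular files named wp-cache-* (the .html and .meta pairs).
    find "$cache_dir" -maxdepth 1 -type f -name 'wp-cache-*' -exec rm -f {} +
}

# Clear the cache only when a directory is given on the command line.
if [ -n "${1:-}" ]; then
    clear_wp_cache "$1"
fi
```

Saved as, say, clear-wp-cache.sh (a hypothetical name and location), it could be scheduled nightly with a crontab entry along the lines of: 0 3 * * * /bin/sh /home/you/bin/clear-wp-cache.sh /home/you/blogdirectory/wp-content/cache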

But now we have another problem… we’ve deleted the cache, which means the entire website isn’t cached anymore. Here’s where a handy PHP script and the Google Sitemaps Plugin come into play! You are using Google Sitemaps, aren’t you? If not, you really should be. I’m going to assume that you’ve installed it, and we’ll proceed.

I’ve created a little PHP script that is Dreamhost friendly. Dreamhost blocks URL access through the file methods, so you can’t just use something like file_get_contents(url). Thankfully they do provide a workaround on their wiki, which I incorporated into this script:

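<?php
// Warm the WP-Cache cache by requesting every URL listed in the sitemap.
// Requires PHP 5 (for SimpleXML) and the cURL extension.

// Fetch a URL with cURL, since Dreamhost blocks URL access through the file functions.
function get_url_contents($url){
        $ch = curl_init();
        $timeout = 5; // connection timeout in seconds; set to zero for no timeout
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        $str = curl_exec($ch);
        curl_close($ch);
        return $str;
}

$xmldata = get_url_contents("http://www.mysitename.com/sitemap.xml");
$xml = simplexml_load_string($xmldata);
if ($xml === false) {
        die("Could not load or parse the sitemap.");
}

// Request each page once so that WP-Cache stores a fresh copy of it.
foreach ($xml->url as $entry) {
        get_url_contents((string) $entry->loc);
}
echo("Completed!");
?>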

This script pulls down your Google Sitemap, which contains a list of all the pages on your site. It then makes a request to each of those pages, at which point WordPress automatically caches every one of them. If we check the WP-Cache panel, we should see… sure enough, we’ve got 35 cached pages now.
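If you would rather not run the two steps by hand, a small wrapper can clear the cache and then run the warm-up script straight away. This is only a sketch: the function name refresh_cache, the variables CACHE_DIR, PHP_BIN and WARM_SCRIPT, and the warm-cache.php file name are all hypothetical. The default PHP path is the Dreamhost PHP 5 binary mentioned in the comments below; adjust it for your own host.

```shell
#!/bin/sh
# Sketch: clear the WP-Cache files, then immediately re-warm the cache.
# CACHE_DIR, PHP_BIN and WARM_SCRIPT are hypothetical settings for this
# wrapper, not WP-Cache options; override them to match your install.

refresh_cache() {
    cache_dir="${CACHE_DIR:-$HOME/blogdirectory/wp-content/cache}"
    php_bin="${PHP_BIN:-/usr/local/php5/bin/php}"
    warm_script="${WARM_SCRIPT:-$HOME/warm-cache.php}"

    # Drop the stale cached pages and their .meta files.
    # -f keeps rm quiet when the cache is already empty.
    rm -f "$cache_dir"/wp-cache-*

    # Run the sitemap script so every page is requested and cached again.
    "$php_bin" "$warm_script"
}

# Do the work only when called with --run, e.g. from cron.
if [ "${1:-}" = "--run" ]; then
    refresh_cache
fi
```

Keeping the PHP script outside the web root and calling it through the command-line PHP binary means nobody can trigger a full cache rebuild just by visiting a URL.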

Go on, check out the speed difference in your website. If you haven’t been caching, you’ll probably be amazed.

22 Responses to “Handling the Digg Effect with WordPress Caching”

  1. Johnny Says:

    Koby,

    Dreamhost has both PHP 4 and PHP 5; you’ll want to run it with the PHP 5 version, which is located at:

    /usr/local/php5/bin/php

    You can do a php -v to find the version.

  2. amit Says:

    Where do we put the PHP script you wrote?

  3. Frac Says:

    Hmm… could you add some sort of server load check into your cache clearing/reloading scripts so it only reloads if the server has been reasonably quiet for the past x number of minutes?

    That would prevent a reload right in the middle of a digg-effect. Not much chance of that occurring, but you never know.

  4. Koby Says:

    Forgive my ignorance, but that little PHP script throws an error about an “undefined function: simplexml_load_string()”

    I don’t see anywhere in your article about where to put this script or if it is to be included in a file, what file that is.

    Did I completely miss something? Can you explain where exactly this script should go?

  5. Brennan Stehling Says:

    A week back my site got listed on the Digg.com homepage and I did not even know what was happening. I tried to SSH into my server but it was overrun. I am running WordPress, and what I did to fix it was to limit the Apache web server to only 10 forked children. Before that, it seems it was spawning more children than the server could handle in an attempt to keep up with requests.

    Hopefully next time the server gets pounded it will be alright.

  6. Koby Says:

    Ok, nevermind, I figured out it’s a PHP5 thing which my server doesn’t support.

  7. Johnny Says:

    The caching plugin is smart enough to reset the cache once you’ve approved comments. You can set a couple of different options on the comments… I’ve found that if I don’t moderate them, I end up with a ton of spam.

  8. maxpower Says:

    I’ve gone through the exact same thing as you but I concluded that in addition to wp-cache, it is also a good idea to use the ‘Coralize this’ plugin which redirects all or some of your page and site contents to the cache servers. Very handy.

    Also, doesn’t wp-cache cache comments too? So if you set the reset time to infinity, how will people know there have been comments?

  9. Johnny Says:

    Chris,

    There’s a problem with dreamhost and the caching module that I should have mentioned… I’m copying this off the wiki:

    If you have “blank pages” in WordPress with wp-cache turned on after you upgrade to PHP 5.1.2, there is a simple fix to solve the problem:

    1. Open the wp-cache-phase2.php file in your favourite text editor
    2. Find the wp_cache_ob_end function
    3. Inside that function, find the line with: ob_end_clean(); (it should be at or about line 219)
    4. Finally, replace that line with: ob_end_flush();

  10. Chris Says:

    I have done everything, but for some reason after clearing the cache and reloading via your script, every page I go to for the first time shows up as blank.

  11. Nick Georgakis Says:

    Hello Johnny,
    I have further optimized WP-Cache to better handle the “Digg Effect” by adding support for serving pre-compressed pages, reducing both the serving time and the bandwidth required per visitor.
    The required modifications are described at my blog page – Modifying WP-Cache 2.0 to generate and cache gzipped output once and serve it multiple times

  12. Everton Says:

    I’ve just written a post that explains how to make wordpress pages faster and how to make better use of caching to protect against being ‘Dugg’ – worth having a look.

    http://www.connectedinternet.co.uk/2006/11/05/1028/

  13. xytsun Says:

    http://www.e-fanyi.net (translation company)
    http://www.e-fanyi.net/index02.htm (Beijing translation company)

  14. Tim Says:

    Thanks for the script! What does it mean if all the cache files are made but only the feeds show up in the cache admin panel?

  15. Brian Turner Says:

    Thanks for the great script suggestion – am going to try this on a few of my installs. Hat tip to SEOmoz for the shout out for this neat plugin. :)

  16. Rich Says:

    I had this happen to me, 2,000 Diggs and nearly 20,000 hits in one day and WordPress is shut down at the moment since it was conflicting with my other sites. Do you know of a solution? I can not reactivate the subdomain, otherwise the hits start pouring in and the server crashes.

  17. bilety lotnicze Says:

    “Handling the Digg Effect with WordPress Caching” – Good work. Congratulations

  18. Aurelius Tjin Says:

    Thanks for the heads up. It really pays to be updated with the latest news on technology and keep abreast with the innovations.

  19. David Ryder Says:

    Awesome – appears to work beautifully

    of course, I’ll have to be dugg or slashdotted to know for sure :-)

  20. sabul Says:

    thanks also for the explanation... my WP blog also shows improvement

  21. Aksi Lucah Says:

    wow, this comment is very nice !!

  22. V_RocKs Says:

    I have to laugh at how many people didn’t know what to do with the little piece of code you wrote about.

    So sad…