CHAPTER 14 – Database Query/Result Caching

Caching the results of database queries can speed up your site and reduce the load on the database server. The biggest challenge is to determine the best caching strategy. Should you cache the results from every single query? Do you know in advance which queries are going to be expensive? The following example demonstrates an approach to this using the Cache_DB class, which is part of the Cache PEAR package. It wraps a DB con- nection object inside a proxy object that intercepts query() calls and uses a Strategy pattern to determine a caching strategy for each query: <?php require_once ''DB.php''; require_once ''Cache/DB.php''; abstract class QueryStrategy { protected $cache; abstract function query($query, $params); } class Cache1HourQueryStrategy extends QueryStrategy { function __construct($dsn, $cache_options) { $this->cache = new Cache_DB(''file'', $cache_options, 3600); $this->cache->setConnection($dsn); } function query($query, $params = array()) { $hitmiss = $this->cache->isCached(md5($query), ''db_cache'') ? " HIT" : "MISS"; print "Cache 1h $hitmiss: $queryn"; return $this->cache->query($query, $params); } } class Cache5MinQueryStrategy extends QueryStrategy { function __construct($dsn, $cache_options) { $this->cache = new Cache_DB(''file'', $cache_options, 300); $this->cache->setConnection($dsn); } function query($query, $params = array()) { $hitmiss = $this->cache->isCached(md5($query), ''db_cache'') ? " HIT" : "MISS"; print "Cache 5m $hitmiss: $queryn"; return $this->cache->query($query, $params); } } class UncachedQueryStrategy extends QueryStrategy { function __construct($dsn) { $this->cache = DB::connect($dsn); } function query($query, $params = array()) { print "Uncached: $queryn"; return $this->cache->query($query, $params); } } class QueryCacheStrategyWrapper { private $cache_1h = null; private $cache_5m = null; private $direct = null; function __construct($dsn) { $opts = array( ''cache_dir'' => ''/tmp'', ''filename_prefix'' => ''query''); $this->cache_1h = new Cache1HourQueryStrategy($dsn, $opts); $this->cache_5m = new Cache5MinQueryStrategy($dsn, $opts); $this->direct = new UncachedQueryStrategy($dsn); } function query($query, $params = array()) { $obj = $this->cache_5m; $re = ''/s+FROMs+(S+)s*((ASs+)?([A-Z0-9_]+))?(,*)/i''; if (preg_match($re, $query, $m)) { if ($m[1] == ''bids'') { $obj = $this->direct; } elseif ($m[5] == '','') { // a join $obj = $this->cache_1h; } } return $obj->query($query, $params); } function __call($method, $args) { return call_user_func_array(array($this->dbh, $method), $args); } } $dbh = new QueryCacheStrategyWrapper(getenv("DSN")); test_query($dbh, "SELECT * FROM vendors"); test_query($dbh, "SELECT v.name, p.name FROM vendors v, products p". " WHERE p.vendor = v.id"); test_query($dbh, "SELECT * FROM bids"); function test_query($dbh, $query) { $u1 = utime(); $r = $dbh->query($query); $u2 = utime(); printf("elapsed: %.04fsnn", $u2 - $u1); } function utime() { list($usec, $sec) = explode(" ", microtime()); return $sec + (double)$usec; } The QueryCacheStrategyWrapper class implements the Strategy wrapper, and uses a regular expression to determine whether the query should be cached and if it should be cached for five minutes or one hour. If the query con- tains a join across multiple database tables, it is cached for one hour; if it is a SELECT on the bids table (for an auction), the query is not cached. The rest will be cached for five minutes. Here is the output from this example the first time the queries are run, and the results are not cached: Cache 5m MISS: SELECT * FROM vendors elapsed: 0.0222s Cache 1h MISS: SELECT v.name, p.name FROM vendors v, products p WHERE p.vendor = v.id elapsed: 0.0661s Uncached: SELECT * FROM bids WHERE product = 42 elapsed: 0.0013s As you can see, the join is relatively expensive compared to the other queries. Now, look at the timings on the second run: Cache 5m MISS: SELECT * FROM vendors elapsed: 0.0098s Cache 1h MISS: SELECT v.name, p.name FROM vendors v, products p WHERE p.vendor = v.id elapsed: 0.0055s Uncached: SELECT * FROM bids WHERE product = 42 elapsed: 0.0015s The cache gave a 125 percent speed-up for the first query, and a whop- ping 1,100 percent speed-up for the join. A good exercise to complete after reading the APD section, "Profiling with ADP," later in this chapter would be to adapt the caching strategy in your own database (just change the "bids" table name), and use APD to compare the per- formance of the wrapped caching solution with a regular non-caching approach.

Call Caching Call caching means caching the return value of a function given a set of parameters. Both the Cache and Cache_Lite PEAR pack- ages provide this. Chapter 11, "Important PEAR Packages," contains an exam- ple of call caching.  

Compiled Templates Most template systems today compile tem- plates to PHP code before displaying them. This not only makes the template display faster, but it also allows an opcode cache to cache them between requests so they do not need to be parsed on every request. The only template packages in PEAR that do not compile to PHP code are HTML_Template_IT and HTML_Template_PHPLIB. If you use one of the others, such as Smarty or HTML_Template_Flexy, everything will be taken care of for you.  

Output Caching Finally, you may cache the printed output of an entire script or just parts of it using PHP's output buffering functions. Again, the PEAR caching packages have wrappers in place for output caching. See the Cache_Lite example in Chapter 11.

Post Comment
Login to post comments