An ideal cache entry is created just before its value is first needed and destroyed just after it is no longer needed. typical cache strategies do not (even try to) approximate this ideal, but half a loaf is better than none.
caching doesn't necessarily help. for example, using another host to cache database query results adds network traffic. determining whether the additional complexity and cost are worthwhile depends on the database server's behavior (it might be caching the results already), network connection speeds, etc.
in a simpler architecture, a server and its cache are hosted on the same machine. if there are redundant servers, there will be duplicate cache entries, so analyzing cost/performance still requires effort. there is no free lunch.
Server requirements
caching should improve performance without impacting functionality. a server that caches values should not:
- return a misleading response when a value is stale
- fail to respond on time when a cache value is missing
when a value is missing, a server may return 200 as long as the response is clearly incomplete. we may use null to indicate that a value is missing.
though unusual in this context, this interface convention is quite common in general. when a value is missing because of a third-party service outage, immediately returning a partial response is optimal.
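a minimal sketch of this convention (the in-process cache, field names, and values below are invented for illustration):

```python
import json

# hypothetical in-process cache; a value may be missing (None)
cache = {"user_name": "alice", "avatar_url": None}  # avatar source is down

def build_response(fields):
    """return a 200 response whose missing values are clearly marked null."""
    body = {field: cache.get(field) for field in fields}  # None -> null
    return 200, json.dumps(body)

status, body = build_response(["user_name", "avatar_url"])
print(status, body)  # 200 {"user_name": "alice", "avatar_url": null}
```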
Server designs
servers typically use request methods to (re)initialize cache values. this design ensures that the server doesn't do unnecessary work, but it can:
- induce request methods to race to (re)initialize the same cache value
- introduce complexity/delay when responses depend on multiple data sources
cache entries often have similar lifecycles, so a single request can induce its request method to (re)initialize multiple expired cache entries. the server's response is delayed if (re)initializing any of these entries requires significant time, even when the request method is async.
(my server needs more than 10 seconds to reinitialize one of its cache entries.)
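to make the problems concrete, here is a sketch of the usual read-through design (the cache layout, TTL, and fetch helper are invented). two concurrent request threads can both find an entry expired and race to reinitialize it, and each blocks its own response while fetching:

```python
import time

cache = {}    # key -> (value, expiry); hypothetical layout
TTL = 60.0

def expensive_fetch(key):
    time.sleep(0.1)            # stand-in for a slow data source
    return f"value of {key}"

def get_value(key):
    """read-through lookup: the request method itself reinitializes
    an expired entry, delaying its own response while it does so."""
    entry = cache.get(key)
    if entry is None or entry[1] < time.monotonic():
        value = expensive_fetch(key)   # two threads may both reach here
        cache[key] = (value, time.monotonic() + TTL)
        return value
    return entry[0]
```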
an alternative design decouples request processing from cache updating. in this design, cache entries are updated independently and request methods construct responses from cached values that may be null.
Peeker pattern
the observer pattern uses callback functions to react to state changes. this observation technique is intrusive, so i think the pattern was named badly. the name monitor pattern is already taken, so let's call a decoupled observer a peeker.
the peeker pattern integrates neatly with the facade pattern, i.e. request methods can use a facade object to read cache values. the facade object also (re)initializes its cache entries when appropriate.
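a minimal sketch of such a facade (the loader functions and TTL policy are invented; the per-entry refresh workers appear in the next section):

```python
import time

class CacheFacade:
    """hypothetical facade: request methods peek at values and never block;
    refresh() (re)initializes entries and runs off the request path."""

    def __init__(self, loaders, ttl=60.0):
        self._loaders = loaders    # key -> function that fetches a fresh value
        self._entries = {}         # key -> (value, expiry)
        self._ttl = ttl

    def peek(self, key):
        """return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None or entry[1] < time.monotonic():
            return None
        return entry[0]

    def refresh(self, key):
        """(re)initialize one cache entry."""
        value = self._loaders[key]()
        self._entries[key] = (value, time.monotonic() + self._ttl)
```

a request method then builds its response from peek() results, treating None as a clearly-marked missing value (see Server requirements above).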
(i found the iconic Darryl Revok image when scanner was a candidate, and had to use it somehow. split brain was my first choice ... naming things is hard :-)
Server threads
a peeker is very fast, so server request methods can remain synchronous. on the other hand, asynchronous processing greatly facilitates cache maintenance.
a cache entry becomes invalid when we discover its new value via notification (observer pattern) or when it expires (TTL). wiring both kinds of trigger into an event loop is awkward, so using an async Python method to maintain a cache would require significant effort.
on the other hand, many caches can be maintained with per-entry worker threads. this design is easy to implement and it performs well, because Python's global interpreter lock limits performance only when tasks are CPU-bound.
(for TTL-based cache entry reinitialization, i override the Thread.run() method in a threading.Timer subclass.)
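here is one way that trick can look (the interval and refresh task are invented). threading.Timer normally fires once, but overriding run() makes it repeat until cancelled:

```python
import threading

def refresh_weather():
    print("reinitializing the weather cache entry")  # stand-in task

class PeriodicTimer(threading.Timer):
    def run(self):
        # self.finished is the Timer's Event; wait() doubles as the delay
        while not self.finished.wait(self.interval):
            self.function(*self.args, **self.kwargs)

timer = PeriodicTimer(30.0, refresh_weather)  # one worker per cache entry
timer.daemon = True   # cache upkeep shouldn't keep the process alive
timer.start()
# timer.cancel() stops the worker thread
```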
TL;DR: Guido was right.