Cache Rewrite

This is a smaller project than others documented here, but it is a good example of careful design that makes a significant improvement under a strong constraint: the API syntax could not change.

Original Cache

A singleton, in-memory Python cache provided an interface to a remote service. The remote service returned fragments of data from a nested set of lists and tables. The cache converted these fragments into a graph of domain model instances.

The cache had the following problems:

Replacement Cache

Overview

Memcache is used as a distributed cache to the underlying service. Data are not processed before being stored - the cache is a very simple image of each call and response.
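As a rough illustration of "a very simple image of each call and response", the sketch below keys each stored response on the call name and its arguments and stores the response unprocessed. Everything here is an assumption made for illustration: `DictBackend` is an in-process stand-in for a Memcache client, and `call_key`, `cached_call`, and the JSON encoding are hypothetical names and choices, not the project's code.

```python
import hashlib
import json


class DictBackend:
    """In-process stand-in for a Memcache client (hypothetical; get/set only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def call_key(method, args):
    """Build a stable key from the call name and its arguments."""
    payload = json.dumps([method, args], sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def cached_call(backend, service, method, *args):
    """Return the stored response for this call, fetching it on a miss."""
    key = call_key(method, list(args))
    raw = backend.get(key)
    if raw is None:
        response = getattr(service, method)(*args)
        raw = json.dumps(response)  # stored as-is: no domain objects are built here
        backend.set(key, raw)
    return json.loads(raw)
```

Because nothing is interpreted at this level, any client can rebuild its own domain objects from the stored responses.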

The replacement cache is smaller, with a shorter lifetime. It does not need to be large because Memcache stores bulk data. Instead, a cache instance is used for a single task, for which it builds a single, consistent graph of domain objects drawn from one generation of the data.

The data were structured as a forest. Clients typically required either general information about all root nodes, or specific information about one tree. So the cache is simplified by redirecting all calls to do one of two things: retrieve general data about all trees, or all data about one tree. The API then returns the appropriate fragment from this graph.
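The redirection described above might look something like the following sketch. The class name, the method names, and the shape of the data (dictionaries with a `children` mapping) are all hypothetical; the only idea taken from the text is that every public call is answered from one of two bulk loads.

```python
class ForestCache:
    """Per-task cache: every API call is served by one of two bulk loads."""

    def __init__(self, source):
        self._source = source  # hypothetical object offering fetch_roots / fetch_tree
        self._roots = None     # general data about all trees
        self._trees = {}       # root name -> all data about that tree

    def _all_roots(self):
        # Bulk load 1: general data about every root, fetched at most once.
        if self._roots is None:
            self._roots = self._source.fetch_roots()
        return self._roots

    def _whole_tree(self, root):
        # Bulk load 2: all data about one tree, fetched at most once per root.
        if root not in self._trees:
            self._trees[root] = self._source.fetch_tree(root)
        return self._trees[root]

    # Public API: each method returns a fragment of one of the two loads.
    def root_names(self):
        return sorted(self._all_roots())

    def root_summary(self, root):
        return self._all_roots()[root]

    def node(self, root, path):
        node = self._whole_tree(root)
        for name in path:
            node = node["children"][name]
        return node
```

The public methods stay as varied as the existing API requires, while the number of distinct requests to the underlying data drops to two.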

To support the above, the underlying service had to be modified to expose generations of data (which were already present internally). A second, small service monitors the source and provides data on generations to clients. If this service is not running (or the original, unmodified source is used), then the new cache continues to function, but cannot guarantee data from a single generation.
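One way to picture the fallback behaviour is sketched below. The `monitor` object, its `current_generation()` method, and the key format are assumptions made for illustration; the text does not describe the real interface. The idea is that a task pins a generation when the monitoring service answers, and degrades to unpinned keys when it does not.

```python
def current_generation(monitor):
    """Return the current generation, or None if the monitor is unavailable."""
    if monitor is None:
        return None
    try:
        return monitor.current_generation()
    except (ConnectionError, TimeoutError):
        # Monitor not running: keep working, but without a generation guarantee.
        return None


def generation_key(base_key, generation):
    """Tie a cache key to one generation when a generation is known."""
    if generation is None:
        return base_key
    return "%s@gen%s" % (base_key, generation)
```

A task would call `current_generation` once at start-up and pass the result to every key it builds, so that all of its reads come from the same generation when one is available.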

Implementation

The new cache is implemented as a series of classes, each inheriting from the previous to add new functionality. The lowest level uses Memcache; the middle level caches tree nodes; the top level caches tree details.
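A minimal sketch of this layering is shown below, under several assumptions: the class and method names are hypothetical, a plain dictionary stands in for Memcache, and the real interface is certainly wider than two methods. Each level overrides only what it adds and defers to the level below through `super()`.

```python
class BaseCache:
    """Lowest level: reads through a shared store (a dict stands in for Memcache)."""

    def __init__(self, service, store):
        self.service = service
        self.store = store

    def get_nodes(self):
        if "nodes" not in self.store:
            self.store["nodes"] = self.service.get_nodes()
        return self.store["nodes"]

    def get_details(self, root):
        key = "details:" + root
        if key not in self.store:
            self.store[key] = self.service.get_details(root)
        return self.store[key]


class NodeCache(BaseCache):
    """Middle level: also keeps the tree nodes for the lifetime of one task."""

    def __init__(self, service, store):
        super().__init__(service, store)
        self._nodes = None

    def get_nodes(self):
        if self._nodes is None:
            self._nodes = super().get_nodes()
        return self._nodes


class DetailCache(NodeCache):
    """Top level: also keeps per-tree details for the lifetime of one task."""

    def __init__(self, service, store):
        super().__init__(service, store)
        self._details = {}

    def get_details(self, root):
        if root not in self._details:
            self._details[root] = super().get_details(root)
        return self._details[root]
```

Since the three classes share one interface, the same test can be run against each level in turn.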

Since all instances provide the same interface, they can be used in existing tests (which I had written to try to understand and control issues in the original client) to ensure that each modification preserves (or improves) the cache semantics.

Delegation was originally chosen instead of inheritance, but the latter proved to be cleaner (it simplifies the creation of a cache from an existing instance for a new task).

Conclusion

Every problem in the original cache was removed, while keeping the same API syntax. The API semantics were drastically simplified.

A more detailed account is given in my blog.