Memcached or Redis? It's a question that nearly always arises in any discussion about squeezing more performance out of a modern, database-driven Web application. When performance needs to be improved, caching is often the first step employed, and Memcached and Redis are typically the first places to turn.
Let's start with the similarities. Both Memcached and Redis are in-memory, key-value data stores belonging to the NoSQL family of data management solutions. Both keep all of their data in RAM, which of course makes them supremely useful as a caching layer. In terms of performance, the two data stores are also remarkably similar, exhibiting nearly identical throughput and latency characteristics.
Besides being in-memory, key-value data stores, both Memcached and Redis are mature and hugely popular open source projects. Memcached was originally developed by Brad Fitzpatrick in 2003 for the LiveJournal website. Since then, Memcached has been rewritten in C (the original implementation was in Perl) and released under the BSD license, and it has become a cornerstone of modern Web applications. Current development of Memcached is focused on stability and optimizations rather than on adding new features.
Redis was created by Salvatore Sanfilippo in 2009, and Sanfilippo remains the lead developer and the sole maintainer of the project today. Redis is sometimes described as "Memcached on steroids," which is hardly surprising considering that parts of Redis were built in response to lessons learned from using Memcached. Redis has more features than Memcached, which makes it more powerful and flexible but also more complex.
Used by many companies and in countless mission-critical production environments, both Memcached and Redis are supported by client libraries implemented in every conceivable programming language, and both are included in a multitude of libraries and packages that developers use. In fact, it's a rare Web stack that does not include built-in support for either Memcached or Redis.
Why are Memcached and Redis so popular? Not only are they extremely effective, they're also relatively simple. Getting started with either Memcached or Redis is easy: it takes only a few minutes to set them up and get them working with an application. Thus a small investment of time and effort can have an immediate, dramatic impact on performance -- often by orders of magnitude. A simple solution with a huge benefit: That's as close to magic as you can get.
When to use Memcached
Because Redis is newer and more feature-rich than Memcached, it is almost always the better choice. But there are two specific scenarios in which Memcached could be preferable. The first is caching small, static data, such as HTML code fragments. Memcached's internal memory management, while not as sophisticated as that of Redis, is more efficient in this case because Memcached consumes comparatively less memory for metadata. Strings, the only data type Memcached supports, are ideal for storing data that is only read, because strings require no further processing.
The second scenario in which Memcached still has a slight advantage over Redis is horizontal scaling. Due in part to its design and in part to its simpler capabilities, Memcached is much easier to scale. That said, there are several tested and accepted approaches to scaling Redis beyond a single server, and the upcoming version 3.0 (read the release candidate notes) will include built-in clustering for exactly that purpose.
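The horizontal scaling mentioned above typically happens on the client side: a Memcached client shards keys across a pool of independent servers. Below is a minimal sketch of that idea; the server addresses and key names are hypothetical, and the naive modulo scheme is used for brevity (production clients generally use consistent hashing so that adding or removing a server remaps only a fraction of the keys):

```python
import hashlib

# Hypothetical pool of Memcached servers; each holds its own slice of keys.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

def server_for(key, servers):
    """Map a key to one server by hashing it.

    Naive modulo sharding: deterministic, but resizing the pool
    remaps most keys. Consistent hashing avoids that in practice.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(server_for("user:42", servers))
```

Because the servers are unaware of one another, scaling out is as simple as adding an address to the client's list.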
When to use Redis
Unless you are working under constraints (e.g. a legacy application) that require the use of Memcached, or your use case matches one of the two scenarios above, you'll almost always want to use Redis instead. By using Redis as a cache, you gain a lot of power -- such as the ability to fine-tune cache contents and durability -- and greater efficiency overall.
Redis' superiority is evident in almost every aspect of cache management. Caches employ a mechanism called data eviction to delete old data from memory in order to make room for new data. Memcached's data eviction mechanism uses an LRU (Least Recently Used) algorithm and somewhat arbitrarily evicts data that's similar in size to the new data. Redis, by contrast, allows fine-grained control over eviction through a choice of six different eviction policies. Redis also employs more sophisticated approaches to memory management and eviction candidate selection.
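The eviction policy is chosen in redis.conf (or at runtime via CONFIG SET). A minimal sketch, with the memory cap picked arbitrarily for illustration:

```
# Cap Redis' memory usage; once reached, the eviction policy kicks in
maxmemory 2gb

# One of the six policies: noeviction, allkeys-lru, volatile-lru,
# allkeys-random, volatile-random, volatile-ttl
maxmemory-policy allkeys-lru
```

The volatile-* policies restrict eviction to keys that carry an expiry, while the allkeys-* policies consider the entire keyspace.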
Redis gives you much greater flexibility regarding the objects you can cache. Whereas Memcached limits key names to 250 bytes and values to 1MB, and works only with plain strings, Redis allows key names and values to be as large as 512MB each, and both are binary safe. Redis has six data types that enable more intelligent caching and manipulation of cached data, opening up a world of possibilities to the application developer.
Instead of storing objects as serialized strings, the developer can use a Redis Hash to store an object's fields and values and manage them under a single key. A Hash spares the developer the need to fetch the entire string, deserialize it, update a value, reserialize the object, and replace the entire string in the cache for every trivial update -- which means lower resource consumption and increased performance. Other data types that Redis offers, such as Lists and Sets, can be leveraged to implement even more complex cache management patterns.
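To make the difference concrete, here is a sketch of the string-based update cycle that Hashes eliminate. A plain Python dict stands in for the remote cache, and the key and field names are hypothetical:

```python
import json

# A plain dict stands in for a remote string cache in this sketch.
cache = {}

# Memcached-style caching: the object lives as one serialized string.
cache["user:42"] = json.dumps({"name": "Ada", "visits": 1})

def bump_visits_as_string(cache, key):
    """Updating one field means round-tripping the whole object."""
    obj = json.loads(cache[key])      # fetch and deserialize everything
    obj["visits"] += 1                # change a single field
    cache[key] = json.dumps(obj)      # reserialize and rewrite everything
    return obj["visits"]

# With a Redis Hash, the same update is a single server-side command,
#   HINCRBY user:42 visits 1
# with no client-side fetch/deserialize/reserialize cycle.
print(bump_visits_as_string(cache, "user:42"))
```

The string approach also grows linearly with object size: a one-field change on a large object still moves the entire payload over the network twice.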
Another important advantage of Redis is that the data it stores isn't opaque, meaning that the server can manipulate it directly. A considerable share of the 160-plus commands available in Redis is devoted to data processing operations and embedding logic in the data store itself via server-side scripting. These built-in commands and user scripts give you the flexibility of handling data processing tasks directly in Redis, without having to ship data across the network to another system for processing.
Redis offers optional and tunable data persistence, which is designed to bootstrap the cache after a planned shutdown or an unplanned failure. While we tend to regard the data in caches as volatile and transient, persisting data to disk can be quite valuable in caching scenarios. Having the cache's data available for loading immediately after restart allows for much shorter cache warm-up periods and removes the load involved in repopulating and recalculating cache contents from the primary data store.
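Persistence is switched on and tuned in redis.conf. A minimal sketch of the two mechanisms (the snapshot thresholds here are illustrative, not recommendations):

```
# Snapshotting (RDB): dump the dataset to disk if at least
# 1 key changed in 900 seconds, or 10 keys in 300 seconds
save 900 1
save 300 10

# Append-only file (AOF): log every write, fsync once per second
appendonly yes
appendfsync everysec
```

RDB snapshots are compact and fast to load on restart, while the AOF trades some disk overhead for a much smaller window of data loss; the two can be combined.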
Last but not least, Redis offers replication. Replication can be used to implement a highly available cache setup that withstands failures and provides uninterrupted service to the application. Considering that a cache failure falls only slightly short of an application failure in terms of its impact on user experience and application performance, having a proven solution that guarantees the cache's contents and service availability is a major advantage in most cases.
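Setting up a read replica takes a single directive on the replica's side; the host and port below are placeholders. The replica performs a full sync of the dataset and then streams subsequent changes from the primary:

```
# In the replica's redis.conf -- point it at the primary server
# (the directive was renamed to "replicaof" in later Redis versions)
slaveof 192.168.1.10 6379
```

A monitoring component such as Redis Sentinel can then promote a replica automatically if the primary fails.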
Open source software continues to provide some of the best technologies available today. When it comes to boosting application performance through caching, Redis and Memcached are established and production-proven solutions and the natural candidates for the job. However, given its richer functionality and more advanced design, Redis should be your first choice in all but a few scenarios.
Itamar Haber (@itamarhaber) is chief developer advocate at Redis Labs, which offers Memcached and Redis as fully managed cloud services for developers. His varied experience includes software product development and management and leadership roles at Xeround, Etagon, Amicada, and M.N.S Ltd. Itamar holds a Master of Business Administration from the joint Kellogg-Recanati program by Northwestern and Tel-Aviv Universities, as well as a Bachelor of Science in Computer Science.
This story, "Why Redis beats Memcached for caching" was originally published by InfoWorld.