Lightning fast NoSQL with Spring Data Redis

6 use cases for Redis in server-side Java applications


Listing 4. Global pessimistic locking


public String aquirePessimisticLockWithTimeout(String lockName,
			int acquireTimeout, int lockTimeout) {
		if (StringUtils.isBlank(lockName) || lockTimeout <= 0)
			return null;
		final String lockKey = lockName;
		String identifier = UUID.randomUUID().toString(); 
		Calendar atoCal = Calendar.getInstance();
		atoCal.add(Calendar.SECOND, acquireTimeout);
		Date atoTime = atoCal.getTime();

		while (true) {
			// try to acquire the lock
			if (redisTemplate.execute(new RedisCallback<Boolean>() {
				@Override
				public Boolean doInRedis(RedisConnection connection)
						throws DataAccessException {
					return connection.setNX(
							redisTemplate.getStringSerializer().serialize(lockKey),
							redisTemplate.getStringSerializer().serialize(identifier));
				}
			})) { 	// successfully acquired the lock, set expiration of the lock
				redisTemplate.execute(new RedisCallback<Boolean>() {
					@Override
					public Boolean doInRedis(RedisConnection connection)
							throws DataAccessException {
						return connection.expire(redisTemplate
								.getStringSerializer().serialize(lockKey),
								lockTimeout);
					}
				});
				return identifier;
			} else { // fail to acquire the lock
				// TTL returns -1 when the key exists but has no expiration, which
				// happens if the lock holder died between SETNX and EXPIRE.
				Long ttl = redisTemplate.execute(new RedisCallback<Long>() {
					@Override
					public Long doInRedis(RedisConnection connection)
							throws DataAccessException {
						return connection.ttl(redisTemplate
								.getStringSerializer().serialize(lockKey));
					}
				});
				if (ttl != null && ttl == -1L) {
					// set expiration of the lock
					redisTemplate.execute(new RedisCallback<Boolean>() {
						@Override
						public Boolean doInRedis(RedisConnection connection)
								throws DataAccessException {
							return connection.expire(redisTemplate
									.getStringSerializer().serialize(lockKey),
									lockTimeout);
						}
					});
				}
				if (acquireTimeout < 0) // no wait
					return null;
				else {
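					try {
						Thread.sleep(100L); // wait 100 milliseconds before retrying
					} catch (InterruptedException ex) {
						Thread.currentThread().interrupt(); // restore the interrupt flag
						return null; // stop waiting for the lock
					}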
				}
				if (new Date().after(atoTime))
					break;
			}
		}
		return null;
	}

	public void releasePessimisticLockWithTimeout(String lockName, String identifier) {
		if (StringUtils.isBlank(lockName) || StringUtils.isBlank(identifier))
			return;
		final String lockKey = lockName;

		redisTemplate.execute(new RedisCallback<Void>() {
					@Override
					public Void doInRedis(RedisConnection connection)
							throws DataAccessException {
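						// Delete the lock only if we still own it. GET followed by DEL is
						// not atomic, so the lock could expire and change hands in between.
						byte[] ctn = connection.get(redisTemplate
								.getStringSerializer().serialize(lockKey));
						if (ctn != null && identifier.equals(
								redisTemplate.getStringSerializer().deserialize(ctn)))
							connection.del(redisTemplate.getStringSerializer().serialize(lockKey));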
						return null;
					}
				});
	}	 

With a relational database, you risk the lock never being released if the program that created it quits unexpectedly. Redis's EXPIRE setting ensures that the lock is released once its timeout elapses, whatever happens to the lock holder.

3. Bit Mask

Suppose a web client needs to poll a web server for client-specific updates against many tables in a database. Blindly querying all these tables for possible updates is costly. To get around this, try saving one integer per client in Redis as a dirty indicator, in which every bit represents one table. A bit is set when there are updates for the client in that table. During polling, no query is fired on a table unless the corresponding bit is set. Redis gets and sets such a bit mask very efficiently when it is stored as a STRING.
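The dirty-indicator logic can be sketched in plain Java. This is an in-memory stand-in rather than Redis code: the class name DirtyMask and the table names are hypothetical, and against a real server the same steps would map to commands such as SETBIT and GETBIT on one key per client.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for a per-client dirty mask (all names are hypothetical).
class DirtyMask {
    // Assumed table order: bit i of the mask corresponds to TABLES[i].
    static final String[] TABLES = {"messages", "friends", "inventory", "quests"};

    // Set the bit when a table receives an update for the client.
    static int markDirty(int mask, int tableIndex) {
        return mask | (1 << tableIndex);
    }

    // Clear the bit once the client has fetched the update.
    static int clear(int mask, int tableIndex) {
        return mask & ~(1 << tableIndex);
    }

    static boolean isDirty(int mask, int tableIndex) {
        return (mask & (1 << tableIndex)) != 0;
    }

    // Only these tables need to be queried during the current poll.
    static List<String> tablesToQuery(int mask) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < TABLES.length; i++) {
            if (isDirty(mask, i)) {
                result.add(TABLES[i]);
            }
        }
        return result;
    }
}
```

A poll handler would read the client's mask, query only the tables returned by tablesToQuery, and clear each bit after the corresponding update has been delivered.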

4. Leaderboard

Redis's ZSET data structure offers a neat solution for game player leaderboards. ZSET works somewhat like PriorityQueue in Java, where objects are organized in a sorted data structure. Game players may be sorted in terms of their score in a leaderboard. Redis ZSET defines a rich list of commands supporting powerful and nimble queries. For example, ZRANGE (including ZREVRANGE) returns the specified range of elements in the sorted set.

You could use this command to list the top 100 players on a leaderboard. ZRANGEBYSCORE returns the elements within a specified score range (for instance, players with scores between 1000 and 2000), ZRANK returns the rank of an element in the sorted set, and so forth.
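To make the semantics of these queries concrete, here is a small in-memory stand-in written in plain Java. It is not Redis or Spring Data Redis code: the Leaderboard class and its method names are hypothetical, and it re-sorts on every call for clarity, whereas a real ZSET keeps its members ordered.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of sorted-set queries (hypothetical names).
class Leaderboard {
    private final Map<String, Double> scores = new HashMap<>();

    // Like ZADD: insert a player or overwrite the player's score.
    void add(String player, double score) {
        scores.put(player, score);
    }

    // Players ordered by score, then by name, lowest first.
    private List<String> ascending() {
        List<String> players = new ArrayList<>(scores.keySet());
        players.sort(Comparator
                .comparingDouble((String p) -> scores.get(p))
                .thenComparing(Comparator.naturalOrder()));
        return players;
    }

    // Like ZREVRANGE key 0 n-1: the n highest-scoring players.
    List<String> top(int n) {
        List<String> players = ascending();
        Collections.reverse(players);
        return new ArrayList<>(players.subList(0, Math.min(n, players.size())));
    }

    // Like ZREVRANK: zero-based rank from the top, or -1 if the player is unknown.
    int reverseRank(String player) {
        List<String> players = ascending();
        int index = players.indexOf(player);
        return index < 0 ? -1 : players.size() - 1 - index;
    }

    // Like ZRANGEBYSCORE key min max: players whose score lies in [min, max].
    List<String> byScore(double min, double max) {
        List<String> result = new ArrayList<>();
        for (String player : ascending()) {
            double score = scores.get(player);
            if (score >= min && score <= max) {
                result.add(player);
            }
        }
        return result;
    }
}
```

With Redis the sorting work happens on the server, so each query transfers only the requested slice of the leaderboard.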

5. Bloom filter

A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not. A query returns either "possibly in set" or "definitely not in set."

The Bloom filter data structure has a wide variety of uses in both online and offline services, including big data analytics. Facebook uses Bloom filters for typeahead searches, to fetch friends and friends of friends matching a user-typed query. Apache HBase uses them to filter out disk reads of HFile blocks that don't contain a particular row or column, thus speeding up reads. Bitly uses a Bloom filter to avoid redirecting users to malicious websites, and Quora implemented a sharded Bloom filter in the feed backend to filter out previously viewed stories. In my own project, I applied a Bloom filter to track user votes on different subjects.

With its speed and throughput, Redis combines exceptionally well with a Bloom filter. Searching GitHub turns up many Redis Bloom filter projects, some of which support tunable precision.
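For readers who have not met the data structure before, here is a minimal in-memory sketch of how a Bloom filter answers "possibly in set" or "definitely not in set." It is an illustration only, not one of the GitHub projects mentioned above: the class name, the FNV-style hash, and the seed values are my own assumptions, and a Redis-backed version would keep the bit array in a Redis STRING instead of a BitSet.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Minimal Bloom filter sketch (hypothetical class; hash choice is an assumption).
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;      // number of bits in the filter
    private final int hashCount; // number of bit positions set per item

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String item) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(data, i));
        }
    }

    // false means "definitely not in set"; true means "possibly in set".
    boolean mightContain(String item) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(data, i))) {
                return false;
            }
        }
        return true;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int index(byte[] data, int i) {
        int h1 = hash(data, 0x811C9DC5);
        int h2 = hash(data, 0x9E3779B9) | 1; // force an odd step
        return Math.floorMod(h1 + i * h2, size);
    }

    // FNV-1a style mixing, parameterized by a seed.
    private static int hash(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }
}
```

Precision is tuned through the two constructor arguments: more bits and a suitable number of hash positions lower the false-positive rate at the cost of memory.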

6. Efficient global notifications: Publish/subscribe channels

A Redis publish/subscribe channel works like a fan-out messaging system, or a topic in JMS semantics. One difference between a JMS topic and a Redis pub/sub channel is that messages published through Redis are not durable. Once a message has been pushed to all the connected clients, it is removed from Redis. In other words, subscribers must stay online to receive new messages. Typical use cases for Redis pub/sub channels include realtime configuration distribution and simple chat servers.

In a web server cluster, each node can be a subscriber to a Redis pub/sub channel. A message published to the channel is pushed instantaneously to all the connected nodes. This message could be a configuration change or a global notification to all online users. Obviously this push communication model is extremely efficient compared with constant polling.
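The fan-out, non-durable behavior described in this section can be illustrated with a few lines of plain Java. The FanOutChannel class below is a hypothetical in-process analogy, not the Redis or Spring Data Redis API; it only shows that a message reaches the subscribers connected at publish time and is not kept for anyone who subscribes later.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// In-process analogy of a non-durable pub/sub channel (hypothetical class).
class FanOutChannel {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    // Push the message to every current subscriber and return how many
    // received it. Nothing is stored once delivery is done.
    int publish(String message) {
        int delivered = 0;
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(message);
            delivered++;
        }
        return delivered;
    }
}
```

A node that subscribes after a configuration change was published never sees that change, which is why a subscriber that reconnects typically also needs to reload the current state from another source.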

Performance optimizing Redis

Redis is extremely powerful, and it can be optimized both generally and for specific programming scenarios. Consider the following techniques.

Time-to-live

Every Redis key, whatever data structure it holds, can be given a time-to-live (TTL). When you set this attribute, the key and its data are removed automatically once the TTL expires. Making good use of this feature will keep memory consumption low in Redis.

Pipelining

Sending multiple commands to Redis in a single request is called pipelining. This technique saves cost on network round-trips, which is important because network latency could be orders of magnitude higher than Redis latency. But there is a catch: the list of Redis commands inside a pipeline must be pre-determined and independent from each other. Pipelining doesn't work if one command's arguments are computed from the results of preceding commands. Listing 5 shows an example of Redis pipelining.

Listing 5. Pipelining


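@Override
public List<LeaderboardEntry> fetchLeaderboard(final String key, final String... playerIds) {
	// Commands issued inside the callback are only queued and return null there;
	// the actual replies come back, in order, from executePipelined().
	List<Object> results = redisTemplate.executePipelined(new RedisCallback<Object>() {	// enable Redis Pipeline
		@Override
		public Object doInRedis(RedisConnection connection) throws DataAccessException {
			for (String playerId : playerIds) {
				connection.zRevRank(key.getBytes(), playerId.getBytes());
				connection.zScore(key.getBytes(), playerId.getBytes());
			}
			return null; // executePipelined requires the callback to return null
		}
	});

	// Two replies per player: the rank first, then the score.
	final List<LeaderboardEntry> entries = new ArrayList<>();
	for (int i = 0; i < playerIds.length; i++) {
		Long rank = (Long) results.get(2 * i);
		Double score = (Double) results.get(2 * i + 1);
		LeaderboardEntry entry = new LeaderboardEntry(playerIds[i],
				score != null ? score.intValue() : -1, rank != null ? rank.intValue() : -1);
		entries.add(entry);
	}
	return entries;
}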

Replica set and sharding

Redis supports master-slave replica configuration. Like MongoDB, the replica set is asymmetric, as slave nodes are read-only to share read workloads. As I mentioned at the beginning of this article, it's also possible to implement sharding to scale out Redis throughput and memory capacity. In reality, Redis is so powerful that an internal Amazon benchmark reveals that one EC2 instance of type r3.4xlarge easily handles 100,000 requests per second. Some have informally reported 700,000 requests per second as a benchmark. For small-to-medium applications, you generally will not need to bother with sharding in Redis. (See the essential Redis in Action for more about performance optimization and sharding in Redis.)

Transactions in Redis

Although Redis doesn't support full ACID transactions like an RDBMS does, its own flavor of transaction is quite effective. In essence, a Redis transaction is a combination of pipelining, optimistic locking, commits, and rollbacks. The idea is to watch a critical record for possible updates (the optimistic lock, set with WATCH), then queue a list of commands (MULTI) and execute them together (EXEC). Depending on whether or not the watched record was updated by another process in the meantime, the queued commands will either commit as a whole or be discarded without taking effect, which plays the role of a rollback.

As an example, consider seller inventory in an auction website. When a buyer tries to buy an item from a seller, you watch for changes on the seller's inventory inside the Redis transaction. In the meantime, you remove the item from the same inventory. Before the transaction closes, if the inventory was touched by another process (for instance, if two buyers purchased the same item at the same moment), the transaction will roll back; otherwise, the transaction will commit. A retry can kick in after a rollback.
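The watch, commit-or-abort, and retry cycle can be sketched without a Redis server. The class below is an in-process analogy and not Redis or Spring Data Redis API: the hypothetical OptimisticInventory uses AtomicReference.compareAndSet where Redis would use WATCH, MULTI, and EXEC, and the retry limit is an assumption added for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// In-process analogy of an optimistic transaction with retry (hypothetical class).
class OptimisticInventory {
    private final AtomicReference<Set<String>> items;

    OptimisticInventory(Set<String> initialItems) {
        items = new AtomicReference<>(
                Collections.unmodifiableSet(new HashSet<>(initialItems)));
    }

    // Returns true if the item was bought, false if it is gone or retries ran out.
    boolean purchase(String item, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            // "WATCH": remember the inventory snapshot we based our decision on.
            Set<String> watched = items.get();
            if (!watched.contains(item)) {
                return false; // someone else already bought it
            }
            Set<String> updated = new HashSet<>(watched);
            updated.remove(item);
            // "EXEC": commit only if nobody replaced the snapshot in the meantime;
            // otherwise nothing is applied and the loop retries.
            if (items.compareAndSet(watched, Collections.unmodifiableSet(updated))) {
                return true;
            }
        }
        return false;
    }

    boolean contains(String item) {
        return items.get().contains(item);
    }
}
```

If two buyers race for the same item, only one compareAndSet succeeds. The other buyer retries, sees that the item is gone, and gets false.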

A transaction pitfall in Spring Data Redis

I learned a hard lesson when enabling Redis transactions on Spring's RedisTemplate with redisTemplate.setEnableTransactionSupport(true): Redis started returning junk data after running for a few days, causing serious data corruption. A similar case was reported on Stack Overflow.

By running a monitor command, my team discovered that after a Redis operation or RedisCallback, Spring doesn't close the Redis connection automatically, as it should do. Reusing an unclosed connection may return junk data from an unexpected key in Redis. Interestingly, this issue doesn't show up when transaction support is set to false in RedisTemplate.

We discovered that we could make Spring close Redis connections automatically by configuring a PlatformTransactionManager (such as DataSourceTransactionManager) in the Spring context, then using the @Transactional annotation to declare the scope of Redis transactions.

Based on this experience, we believe it's good practice to configure two separate RedisTemplates in the Spring context: one with transaction support disabled, used for most Redis operations, and another with transaction support enabled, used only for Redis transactions. Of course, PlatformTransactionManager and @Transactional must be declared for the second template to prevent junk values from being returned.

Moreover, we learned the downside of mixing a Redis transaction with a relational database transaction, in this case JDBC. Mixed transactions do not behave as you would expect.

Conclusion

With this article I've hoped to introduce other Java enterprise developers to the power of Redis, particularly when used as a remote data cache and for volatile data. I've introduced six effective use cases for Redis, shared a few performance optimizing techniques, and explained how my team at Glu Mobile worked around getting junk data as a result of misconfigured transactions in Spring Data Redis. I hope that this article has piqued your curiosity about Redis NoSQL and offered some pathways for exploring it in your own Java EE systems.
