Redis keys are binary safe: you can use any binary sequence as a key, from a string like “foo” to the content of a JPEG file. The empty string is also a valid key. The maximum allowed key size is 512 MB.
A few rules:
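- very long keys are not a good idea: they use more memory, and looking them up may require costly key comparisons
- very short keys are often not a good idea either: the memory saved rarely outweighs the loss of readability
- try to stick with a schema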
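As an illustration of the last rule, here is a minimal sketch of a key-building helper. The `object-type:id` layout with `:` as separator is one common convention, assumed here for the example. `make_key` is a hypothetical helper, not part of Redis or any client library.

```python
def make_key(*parts: object) -> str:
    """Join key components with ':' (hypothetical helper, not a Redis API)."""
    pieces = [str(p) for p in parts]
    # Reject components containing the separator so keys stay unambiguous.
    if not pieces or any(":" in p for p in pieces):
        raise ValueError("key components must not contain ':'")
    return ":".join(pieces)


user_key = make_key("user", 1000)
followers_key = make_key("user", 1000, "followers")
print(user_key)
print(followers_key)
```

Building every key through one function keeps the schema in a single place, which makes it easier to change later.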
Key Expiration
Key expiration lets you set a timeout for a key, also known as “time to live”, or “TTL”. When the time to live elapses, the key is automatically destroyed.
Key expiration can be set with either seconds or milliseconds precision. However, the expire time resolution is always 1 millisecond. Information about expirations is replicated and persisted on disk, so time virtually passes even while your Redis server is stopped (this means that Redis saves the date at which a key will expire).
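The effect of saving the expiration date rather than a countdown can be sketched with a toy in-memory model. This is not how Redis is implemented: `ExpiringStore` and its injectable clock are assumptions made for the example. Only the method names and the `-1`/`-2` return values are meant to mirror the `EXPIRE`, `PEXPIRE`, and `PTTL` commands.

```python
import time


class ExpiringStore:
    """Toy model: a TTL is stored as an absolute deadline in milliseconds."""

    def __init__(self, clock=time.time):
        self._clock = clock        # injectable clock so downtime can be simulated
        self.data = {}
        self.deadline_ms = {}      # key -> absolute Unix time in milliseconds

    def _now_ms(self) -> int:
        return int(self._clock() * 1000)

    def set(self, key, value):
        self.data[key] = value
        self.deadline_ms.pop(key, None)    # in this toy, a plain set clears the TTL

    def expire(self, key, seconds: int) -> bool:
        # Seconds precision is converted to the same millisecond resolution.
        return self.pexpire(key, seconds * 1000)

    def pexpire(self, key, milliseconds: int) -> bool:
        if key not in self.data:
            return False
        self.deadline_ms[key] = self._now_ms() + milliseconds
        return True

    def pttl(self, key) -> int:
        """Remaining milliseconds; -2 if the key is missing or expired, -1 if it has no TTL."""
        if key not in self.data:
            return -2
        deadline = self.deadline_ms.get(key)
        if deadline is None:
            return -1
        remaining = deadline - self._now_ms()
        return remaining if remaining > 0 else -2


fake_now = [1_000.0]                       # simulated wall clock, in seconds
store = ExpiringStore(clock=lambda: fake_now[0])
store.set("session", "abc")
store.expire("session", 10)

fake_now[0] += 4                           # four seconds pass while running
remaining_before = store.pttl("session")

fake_now[0] += 60                          # the "server" is stopped for a minute
remaining_after = store.pttl("session")
print(remaining_before, remaining_after)
```

Because the deadline is absolute, nothing has to tick while the process is down. On restart, the key is already past its deadline.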
Keys in a Redis database are stored in a hash table (the keyspace), and they point to the related data structure. The TTL value is not saved in the key itself, because in those cases where expiration is not required (especially when Redis is not used as a cache) this would add memory overhead. Instead, the TTL for volatile keys is stored in a secondary hash table, usually smaller than the main dictionary. Each entry in this secondary hash table holds a pointer to the key and associates it with the TTL value. The expiration mechanism works as follows:
- passive expiration: when a key is accessed, Redis checks whether the key exists and whether it has expired. If it has expired, the key is removed and nil is returned. However, this is not enough, because there may be expired keys that clients never access again.
- active expiration: this happens by random sampling. Keys are not sorted by expiration time, so the strategy for finding suitable candidates for expiration is to sample the secondary hash table.
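The two-table layout and passive expiration described above can be sketched as follows. This is a simplified model built on Python dictionaries, not the Redis implementation. The `Keyspace` class and its method names are assumptions made for the example.

```python
class Keyspace:
    """Toy model of the main dictionary plus a secondary table for TTLs."""

    def __init__(self):
        self.main = {}       # key -> value, holds every key
        self.expires = {}    # key -> absolute deadline, holds volatile keys only

    def set(self, key, value, deadline=None):
        self.main[key] = value
        if deadline is None:
            self.expires.pop(key, None)    # keys without a TTL cost nothing here
        else:
            self.expires[key] = deadline

    def get(self, key, now):
        # Passive expiration: the deadline is only checked on access.
        deadline = self.expires.get(key)
        if deadline is not None and deadline <= now:
            del self.main[key]
            del self.expires[key]
            return None                    # the client sees nil
        return self.main.get(key)


ks = Keyspace()
ks.set("config", "v1")                     # no TTL
ks.set("session", "abc", deadline=100)
ks.set("stale", "x", deadline=50)

now = 60
keys_in_memory_before = len(ks.main)       # "stale" is expired but still stored
stale_value = ks.get("stale", now)         # the access triggers removal
keys_in_memory_after = len(ks.main)
print(keys_in_memory_before, stale_value, keys_in_memory_after)
```

In this model, an expired key that is never read again stays in `main` indefinitely, which is the gap that active expiration is meant to close.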
The original approach (before Redis 6) was simply to delete the sampled keys whose TTL had expired. The problem with this approach is that, as the loop of sampling and deleting progresses, each sample contains fewer expired keys, so the loop becomes resource-intensive without reclaiming a relevant number of keys, and the random sampling wastes CPU cycles. Because of this, the algorithm would stop sampling when the share of expired keys in a sample fell below a configurable threshold, 25% by default. This means that, in the worst case, 25% of the keys may be logically expired but still using memory.
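A minimal sketch of such a sampling loop is shown below, under stated assumptions. The function name, the two dictionaries, and the sample size of 20 are chosen for the example and are not taken from the text above. Only the 25% stopping threshold comes from it.

```python
import random


def active_expire_cycle(expires, main, now, sample_size=20, threshold=0.25, rng=random):
    """Sample volatile keys and delete the expired ones.

    The loop repeats while the expired share of the latest sample is above
    `threshold`, and stops as soon as it is at or below it.
    """
    removed = 0
    while expires:
        sample = rng.sample(list(expires), min(sample_size, len(expires)))
        expired = [key for key in sample if expires[key] <= now]
        for key in expired:
            del expires[key]
            main.pop(key, None)
            removed += 1
        if len(expired) / len(sample) <= threshold:
            break                          # too few hits: stop spending CPU
    return removed


rng = random.Random(42)
now = 100
# 1000 volatile keys, half of them already past their deadline.
expires = {f"key:{i}": (50 if i % 2 == 0 else 500) for i in range(1000)}
main = {key: "value" for key in expires}

removed = active_expire_cycle(expires, main, now, rng=rng)
leftover = sum(1 for deadline in expires.values() if deadline <= now)
print(f"removed={removed}, expired but still in memory={leftover}")
```

The loop typically ends with some expired keys still in memory. This is the trade-off the paragraph describes: the lower the stopping threshold, the more CPU is spent on samples that find little to delete.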
An improvement was added in Redis 6 to reduce the amount of hidden memory, that is, the memory allocated to expired keys that the previous implementation would not deallocate. The improvement consists of a radix tree introduced alongside the secondary hash table. With the radix tree, the information about the expiration times of the sampled keys is put to use, which reduces the amount of hidden memory allocated. This solution helps drop the 25% of hidden keys without big design changes and also reduces the amount of memory used.
Refer to the Redis configuration file to learn more about configuring this algorithm with the active-expire-effort parameter, which sets the tolerance for expired keys still present in the system. As a general rule, when sizing a system in which keys have a TTL set, take hidden memory into account and test memory pressure accordingly.
Why does my replica have a different number of keys than its primary instance?
If you use keys with a limited time to live, this is normal behavior. This is what happens:
- the primary generates an RDB file on the first synchronization with the replica
- the RDB file will not include keys that have already expired in the primary but are still in memory
- these keys are still in the memory of the Redis primary, even if logically expired. While these keys are not logically part of the dataset, they are counted in the INFO output and by the DBSIZE command
- when the replica reads the RDB file generated by the primary, this set of keys will not be loaded
Because of this, it’s common for users with many expired keys to see fewer keys in the replicas. However, logically, the primary and replica will have the same content.
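The sequence above can be simulated with a small sketch. It models the primary and the replica as plain dictionaries and the RDB file as a filtered snapshot. All function names here are hypothetical and chosen for the example, and the real RDB format is not represented.

```python
def dbsize(main):
    """Count every stored key, including expired ones not yet reclaimed."""
    return len(main)


def dump_snapshot(main, expires, now):
    """Build a snapshot that skips keys whose deadline has already passed."""
    return {
        key: (value, expires.get(key))
        for key, value in main.items()
        if key not in expires or expires[key] > now
    }


def load_snapshot(snapshot):
    """Rebuild the two tables on the replica side."""
    main = {key: value for key, (value, _) in snapshot.items()}
    expires = {key: d for key, (_, d) in snapshot.items() if d is not None}
    return main, expires


def logical_keys(main, expires, now):
    """Keys that are part of the dataset from a client's point of view."""
    return {key for key in main if key not in expires or expires[key] > now}


now = 100
primary_main = {"a": 1, "b": 2, "c": 3}
primary_expires = {"b": 50, "c": 500}      # "b" is expired but still in memory

snapshot = dump_snapshot(primary_main, primary_expires, now)
replica_main, replica_expires = load_snapshot(snapshot)

primary_count = dbsize(primary_main)
replica_count = dbsize(replica_main)
same_content = (
    logical_keys(primary_main, primary_expires, now)
    == logical_keys(replica_main, replica_expires, now)
)
print(primary_count, replica_count, same_content)
```

In this model the raw key counts differ while the set of logically live keys is identical on both sides.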