Caching in Go - Part 1 - A Basic Implementation
Saturday, Nov 30, 2024 | 10 minute read
For many use cases, a fully featured external cache library is overkill, and writing your own implementation is the better option: you will understand exactly how it works, and you will be able to extend it to your needs. This post covers the basics of implementing a cache in Go and provides a working template for you to use.
Introduction
When building a web application in Go, I needed a cache to improve performance. I began by looking at the available libraries and found several that more than fulfilled my needs. Still, I found myself wondering what it would take to implement my own cache: one with the bare minimum of features I needed, and one I could easily extend in the future. Most of the existing libraries had many features I would never use, and I prefer to understand how things work before I rely on them. What started as a learning exercise ended with a working implementation that I now use in my own projects, and that I am sharing with the Go community in the hope that it will help others. Simply copy and paste the code into your project and extend it as needed.
The Requirements
- It should be easy to use and extend
- It should be thread safe
- It should be able to store any type of data
- It should be able to expire data and cleanup the cache
- It should be able to have a custom cleanup function
The Implementation
Even if you only need to cache data of a specific data type, I would recommend you implement the cache with generics, instead of hard coding the data type. There is no major performance penalty for using generics, and it will allow you to reuse the cache for other data types in the future, without having to change the code.
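To make the reuse concrete, here is a minimal, self-contained sketch (Box and NewBox are illustrative names, not part of the cache below) showing a single generic definition instantiated with two different key and value types:

```go
package main

import "fmt"

// Box is a tiny stand-in demonstrating the idea: one generic
// definition serves any comparable key type and any value type.
type Box[K comparable, V any] struct {
	items map[K]V
}

func NewBox[K comparable, V any]() *Box[K, V] {
	return &Box[K, V]{items: make(map[K]V)}
}

func main() {
	ints := NewBox[string, int]() // string keys, int values
	ints.items["answer"] = 42
	urls := NewBox[int, string]() // int keys, string values
	urls.items[1] = "https://go.dev"
	fmt.Println(ints.items["answer"], urls.items[1])
}
```

The generic definition compiles once and is instantiated per type at the use sites, which is why caching a different data type later requires no changes to the cache code itself.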
type Cache[K comparable, V any] struct {
	mutex           sync.RWMutex
	cache           map[K]*CacheItem[V]
	cacheDuration   time.Duration
	cleanupInterval time.Duration
	cleanupTicker   *time.Ticker
	cleanupChan     chan bool
	cleanupFunc     func(V)
}

type CacheItem[V any] struct {
	value        V
	lastAccessed time.Time
}
The Cache is essentially a map of CacheItems, where the key is the comparable type K and the value is a *CacheItem[V]. The cleanupFunc field holds a function that takes a V and performs any necessary cleanup operations. This allows a custom cleanup function to be executed when an item expires, which is useful for releasing resources, such as closing a database connection. Since the cache can be used with numerous data types, the cleanup function is supplied at cache initialization instead of being hardcoded into the cache implementation.
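As a concrete sketch of what such a cleanup function might look like (conn here is a hypothetical stand-in for a resource such as a database handle, not part of the cache):

```go
package main

import "fmt"

// conn is a hypothetical resource-holding value; Close releases it.
type conn struct{ open bool }

func (c *conn) Close() { c.open = false }

func main() {
	// The cleanup function is declared by the caller and handed to the
	// cache at initialization; it receives each evicted value.
	cleanup := func(c *conn) { c.Close() }

	c := &conn{open: true}
	cleanup(c)          // what the cache would do on expiry
	fmt.Println(c.open) // false: the resource was released
}
```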
Each cache item stores the value and the last time it was accessed. Whenever the value field is accessed, the lastAccessed field is updated.
Now that we have structs representing the cache and cache items, we can implement the cleanup loop. This loop will run in a separate goroutine, and it will utilize the cleanupTicker to perform cleanup operations at a regular interval. The cleanupTicker is initialized with the cleanupInterval duration.
func (c *Cache[K, V]) cleanupLoop() {
	for {
		select {
		case <-c.cleanupTicker.C:
			// Phase 1: collect expired keys under a read lock so
			// concurrent reads are not blocked during the scan.
			var expired []K
			c.mutex.RLock()
			for k, v := range c.cache {
				if time.Since(v.lastAccessed) > c.cacheDuration {
					expired = append(expired, k)
				}
			}
			c.mutex.RUnlock()
			// Phase 2: remove each expired item under the write lock,
			// re-checking expiry in case it was accessed since the scan.
			for _, k := range expired {
				c.mutex.Lock()
				if v, ok := c.cache[k]; ok &&
					time.Since(v.lastAccessed) > c.cacheDuration {
					if c.cleanupFunc != nil {
						c.cleanupFunc(v.value)
					}
					delete(c.cache, k)
				}
				c.mutex.Unlock()
			}
		case <-c.cleanupChan:
			return
		}
	}
}
The cleanup operation runs in two phases. First, it scans the cache under a read lock and collects the keys of expired items; other goroutines can still read from the cache during the scan. Then, for each expired key, it briefly takes the write lock, re-checks that the item is still expired (it may have been accessed or replaced since the scan), runs the cleanupFunc with the value, and removes the item from the cache. Holding the write lock only for these short removals keeps the cache responsive while cleanup is in progress. The two phases are kept separate because the map must never be modified by other goroutines while it is being iterated, so we cannot safely release and re-acquire the lock in the middle of a range over the map.
With the cleanupLoop function complete, we can now implement a function to initialize a cache and return it to the caller. This function will also start the cleanup loop.
func NewCache[K comparable, V any](cleanupInterval,
	cacheDuration time.Duration,
	cleanupFunc func(V)) *Cache[K, V] {
	cache := &Cache[K, V]{
		cache:           make(map[K]*CacheItem[V]),
		cacheDuration:   cacheDuration,
		cleanupInterval: cleanupInterval,
		cleanupTicker:   time.NewTicker(cleanupInterval),
		cleanupChan:     make(chan bool),
		cleanupFunc:     cleanupFunc,
	}
	go cache.cleanupLoop()
	return cache
}
The NewCache function takes in the cleanupInterval, cacheDuration, and cleanupFunc as parameters. It creates a new cache with the specified parameters, and starts the cleanup loop in a separate goroutine.
We now have a working cache implementation. However, it is not very useful yet, as it does not have any functionality to store or retrieve data. Let’s implement functions to store, retrieve, and delete data from the cache while keeping in mind the requirement of thread safety.
func (c *Cache[K, V]) Get(key K) (V, bool) {
	c.mutex.RLock()
	defer c.mutex.RUnlock()
	item, ok := c.cache[key]
	if !ok {
		var i V
		return i, false
	}
	item.lastAccessed = time.Now()
	return item.value, true
}
The Get function takes a key of type K and returns a value of type V along with a boolean indicating whether the key was found in the cache. It acquires a read lock on the cache, deferring the unlock until the function returns, then checks whether the key exists. If it does, Get updates the item's last accessed time and returns the value with true; otherwise it returns the zero value of V and false. It is important to note that the write lock is not acquired when updating the last accessed time. Because the map stores a pointer to the item, the map itself is not modified; only the item's lastAccessed field is. If two goroutines access the same key concurrently, one timestamp update may overwrite the other, but the difference between the two times is insignificant, so we do not care which one wins. Since taking the write lock here buys us nothing, we skip it and improve performance. Strictly speaking, this is a data race that Go's race detector will flag; if that matters for your project, store lastAccessed as an atomic value or take the write lock instead.
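The "var i V" trick deserves a note: declaring a variable of a type parameter yields that type's zero value, which is the only way to produce an empty result for an arbitrary V. A minimal sketch:

```go
package main

import "fmt"

// zero returns the zero value of any type, the same trick Get uses
// on a cache miss ("var i V; return i, false").
func zero[V any]() V {
	var v V
	return v
}

func main() {
	fmt.Println(zero[int]())          // 0
	fmt.Println(zero[string]() == "") // true
	fmt.Println(zero[*int]() == nil)  // true
}
```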
func (c *Cache[K, V]) Set(key K, value V) V {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	if item, ok := c.cache[key]; ok {
		if c.cleanupFunc != nil {
			c.cleanupFunc(item.value)
		}
		item.value = value
		item.lastAccessed = time.Now()
	} else {
		c.cache[key] = &CacheItem[V]{
			value:        value,
			lastAccessed: time.Now(),
		}
	}
	return value
}
The Set function takes a key of type K and a value of type V, and returns the value. It acquires the write lock, deferring the unlock until the function returns, then checks whether the key already exists in the cache. If it does, Set executes the cleanupFunc on the old value and updates both the value and the last accessed time of the existing item. Otherwise, it creates a new cache item with the given value and the current time, and adds it to the cache.
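The overwrite path is worth seeing in isolation: the old value is handed to the cleanup function before being replaced, and because the map stores pointers, the existing item is updated in place. This standalone sketch (item and set are illustrative names) mirrors that logic:

```go
package main

import "fmt"

type item struct{ value string }

// set mirrors the cache's Set: on overwrite, hand the old value to
// cleanup, then update the existing item through its pointer.
func set(m map[string]*item, key, value string, cleanup func(string)) {
	if it, ok := m[key]; ok {
		if cleanup != nil {
			cleanup(it.value) // release resources tied to the old value
		}
		it.value = value
	} else {
		m[key] = &item{value: value}
	}
}

func main() {
	m := make(map[string]*item)
	set(m, "k", "first", nil)
	set(m, "k", "second", func(old string) {
		fmt.Println("cleaning up:", old) // cleaning up: first
	})
	fmt.Println(m["k"].value) // second
}
```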
func (c *Cache[K, V]) Delete(key K) {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	delete(c.cache, key)
}
The Delete function takes a key of type K. It acquires the write lock, deferring the unlock until the function returns, and deletes the key from the cache.
func (c *Cache[K, V]) Close() {
	c.mutex.Lock()
	defer c.mutex.Unlock()
	c.cleanupTicker.Stop()
	close(c.cleanupChan)
	for key, item := range c.cache {
		if c.cleanupFunc != nil {
			c.cleanupFunc(item.value)
		}
		delete(c.cache, key)
	}
}
The Close function closes the cache. It stops the cleanup ticker and closes the cleanup channel, which terminates the cleanup loop so no further cleanup operations are performed. It then iterates over the cache, calling the cleanupFunc for each remaining item, and deletes all items from the cache. Note that Close should be called at most once: closing cleanupChan a second time would panic.
We now have a fully functional cache implementation.
In a followup post, I will demonstrate how to improve this cache implementation through sharding. Under a large volume of concurrent requests, a single lock becomes a point of contention, which can lead to performance issues; sharding mitigates this by splitting the cache across several independently locked segments.
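As a small preview of the idea (shardCount and shardFor are illustrative names, not part of the cache above), each key hashes deterministically to one of N shards, each of which would hold its own map and its own lock:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const shardCount = 16

// shardFor hashes a key to a shard index; goroutines then contend on
// shardCount smaller locks instead of one global lock.
func shardFor(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % shardCount)
}

func main() {
	fmt.Println(shardFor("user:1") == shardFor("user:1")) // true: deterministic
	fmt.Println(shardFor("a") >= 0 && shardFor("a") < shardCount) // true: in range
}
```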
Usage
Now that we have a working cache implementation, let’s see how we can use it in our application.
func main() {
	cache := utils.NewCache[string, db.Database](time.Second*15,
		time.Second*60,
		func(d db.Database) { d.Close() })
	cache.Set("db", db.New())
	database, ok := cache.Get("db")
	if ok {
		// use database ...
	}
}
In the example, the key is a string and the value is of type db.Database, an interface representing a database connection. However, you can use any data types for the key and value, as long as the key is comparable. The cache is initialized with a cleanup interval of 15 seconds, a cache duration of 60 seconds, and a cleanup function that closes the database connection. Since Get also returns a boolean, check it to see whether the entry was still present before using the value.
Then one can insert items into the cache by calling the Set function, and retrieve items from the cache by calling the Get function. The Delete function can be used to remove items from the cache manually if needed.
Conclusion
This is a basic cache implementation in Go that will be sufficient for most use cases, and since you now understand how it works, you can easily extend it to meet your needs. Feel free to copy and paste the code into your project and use it however you desire.
Full Code