Maximizing Database Performance with Write-Behind Caching

Enhancing database efficiency under heavy write loads using a write-behind caching strategy for improved scalability and performance.

Kokodi, a mobile clicker game with heavy database writes, faced severe performance issues when its user base reached 90,000 players. The game, running on a MongoDB database with two nodes (a primary and a replica, each with 32 vCPUs and 128GB RAM), struggled under the load of write operations. Despite batching writes by sending updates every two seconds instead of per click, MongoDB was still a bottleneck. With plans for a much larger player base, Kokodi needed a scalable solution. My task was to optimize database performance to handle future growth efficiently.

Identifying Performance Bottlenecks

The first step was to analyze slow queries and identify inefficiencies. One major issue was the absence of indexes on the fields used in update filters (MongoDB's equivalent of SQL WHERE conditions). Without proper indexing, MongoDB had to perform full collection scans to locate the documents to update, leading to high latency and unnecessary CPU usage. Adding indexes to the frequently filtered fields significantly improved query response times and reduced database load. However, even with optimized queries, the system still struggled under high concurrent writes, indicating the need for a more robust solution.
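The cost difference an index makes can be sketched in TypeScript with an in-memory stand-in for the collection (the user shape and field names here are hypothetical, purely for illustration):

```typescript
// Hypothetical user documents, standing in for a MongoDB collection.
interface User { userId: string; points: number; }

const users: User[] = Array.from({ length: 100_000 }, (_, i) => ({
  userId: `user-${i}`,
  points: 0,
}));

// Without an index: every update filter forces a full scan, O(n) per write.
function findByScan(id: string): User | undefined {
  return users.find((u) => u.userId === id);
}

// With an index: a lookup structure maps the filtered field straight to the
// document, O(1) per write instead of O(n).
const userIdIndex = new Map(users.map((u) => [u.userId, u]));
function findByIndex(id: string): User | undefined {
  return userIdIndex.get(id);
}

// Both paths resolve to the same document; only the cost differs.
const scanned = findByScan("user-99999");
const indexed = findByIndex("user-99999");
```

Under heavy write traffic, that per-write scan cost is what shows up as the high latency and CPU usage described above.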

Implementing Write-Behind Caching with Dragonfly

To further enhance performance, I implemented a write-behind caching system using Dragonfly, a source-available, Redis-compatible in-memory data store (a drop-in replacement for Redis rather than a fork of it). Instead of writing directly to MongoDB on each request, user updates were stored in Dragonfly and periodically synchronized with the database. This approach drastically reduced the number of direct database writes while ensuring consistency across sessions.

Write-Behind Caching diagram

Data Flow and Caching Strategy

When a user sends an update via the Next.js application, the request increments points and other relevant fields, such as level and daily click limit, in Dragonfly. The hash storing the user's data includes an updatesCount field, which tracks the number of modifications. If the user is not already in the cache, their data is loaded from MongoDB and stored in Dragonfly under a TTL, so that inactive keys eventually expire.
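The update path can be sketched with a Map standing in for the Dragonfly hash store; the field names follow the article, while the loader and default values are hypothetical:

```typescript
// Fields mirror the article: gameplay counters plus an updatesCount that the
// sync job later uses to detect modified keys.
type UserHash = { points: number; level: number; dailyClicks: number; updatesCount: number };

// In-memory stand-in for Dragonfly.
const cache = new Map<string, UserHash>();

// Simulated MongoDB lookup for a cache miss (hypothetical shape and defaults).
function loadFromMongo(_userId: string): UserHash {
  return { points: 0, level: 1, dailyClicks: 0, updatesCount: 0 };
}

// Handle one update request: populate the hash on a miss, then increment the
// gameplay fields and bump updatesCount so the sync job sees the change.
function applyUpdate(userId: string, pointsDelta: number): UserHash {
  let hash = cache.get(userId);
  if (!hash) {
    hash = loadFromMongo(userId);
    cache.set(userId, hash); // in Dragonfly this would also set the TTL
  }
  hash.points += pointsDelta;
  hash.dailyClicks += 1;
  hash.updatesCount += 1;
  return hash;
}
```

Every request touches only the in-memory hash; MongoDB is involved solely on a cache miss.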

A separate Node.js application called wb-manager runs a cron job at fixed intervals to synchronize data. The manager retrieves all keys from Dragonfly and distributes them among multiple workers, leveraging available CPU cores. Each worker processes a batch of keys, ensuring that the number of keys per batch does not exceed a defined MAX_BATCH_SIZE. The workers fetch all user data stored in hashes and check whether lastUpdateCount matches updatesCount. If it does, the key is ignored since no changes have been made since the last sync. Otherwise, the worker batches all pending updates and sends a single bulk update to MongoDB.
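A simplified sketch of one sync pass follows; MAX_BATCH_SIZE, updatesCount, and lastUpdateCount are taken from the article, while everything else (the batch size value, the user shape, the op format) is hypothetical:

```typescript
// Hypothetical batch size; in wb-manager this caps the keys per worker batch.
const MAX_BATCH_SIZE = 500;

interface CachedUser { updatesCount: number; lastUpdateCount: number; points: number; }

// Split all cache keys into batches of at most MAX_BATCH_SIZE, which the
// manager would distribute across worker processes on the available cores.
function partition(keys: string[], size = MAX_BATCH_SIZE): string[][] {
  const batches: string[][] = [];
  for (let i = 0; i < keys.length; i += size) batches.push(keys.slice(i, i + size));
  return batches;
}

// One worker's pass over a batch: skip clean keys, collect dirty ones into a
// single list of ops (in production this would become one MongoDB bulk update).
function collectDirty(batch: string[], store: Map<string, CachedUser>) {
  const ops: { key: string; points: number }[] = [];
  for (const key of batch) {
    const user = store.get(key);
    if (!user || user.updatesCount === user.lastUpdateCount) continue; // unchanged since last sync
    ops.push({ key, points: user.points });
    user.lastUpdateCount = user.updatesCount; // mark as synced
  }
  return ops;
}
```

The dirty check is what keeps the sync cheap: idle players cost nothing, and active players cost one bulk-update entry per cycle regardless of how many clicks they generated.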

To improve performance on the Dragonfly side as well, I used pipelining to send multiple commands in a single request, reducing the number of round trips between the client and the server. This optimization significantly improved throughput and reduced latency, ensuring that the cache could handle high load efficiently.
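With a real client library this is typically a `pipeline()` call; the round-trip saving itself can be illustrated with a toy client that counts flushes (everything here is hypothetical, not a real client API):

```typescript
// Toy client: each network round trip to the server is counted explicitly.
class PipelineClient {
  roundTrips = 0;
  private queued: string[] = [];

  // Without pipelining: one command costs one round trip.
  send(_command: string): void {
    this.roundTrips += 1;
  }

  // With pipelining: commands are buffered locally...
  queue(command: string): void {
    this.queued.push(command);
  }

  // ...and sent together in a single round trip; returns how many were sent.
  flush(): number {
    if (this.queued.length > 0) this.roundTrips += 1;
    const sent = this.queued.length;
    this.queued = [];
    return sent;
  }
}
```

For a sync pass that reads hundreds of hashes, collapsing N round trips into one is where most of the latency win comes from.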

Efficient Synchronization and Memory Management

To manage cache expiration efficiently, a TTL (Time-To-Live) is set slightly higher than the sync interval. This ensures that unused keys expire automatically, preventing unnecessary memory usage. It also guarantees that if a key is updated during a sync, new updates are preserved for the next sync cycle. The combination of write-behind caching and controlled batching significantly reduces MongoDB’s write load while maintaining accurate and timely data updates.

The system ensures that if a failure occurs at any point, data consistency is maintained by reloading state from MongoDB. Since Dragonfly acts as a temporary buffer rather than a permanent store, any data loss is bounded by the window between synchronizations. Additionally, Dragonfly's point-in-time snapshots create backups at intervals, and running multiple Dragonfly instances provides redundancy. MongoDB's write concern settings can also be tuned to ensure durability across replicas.

Conclusion

With this optimized system, Kokodi has surpassed 600,000 users, and everything runs smoothly without overloading the database. The architecture ensures that MongoDB handles significantly higher loads without performance degradation, enabling seamless scalability. The write-behind approach has proven to be a robust and efficient solution for managing high-frequency writes in a real-time gaming environment.

By combining indexing optimizations with a carefully designed write-behind caching mechanism, the system effectively distributes load and ensures fast updates without stressing the primary database. This architecture allows Kokodi to continue growing while maintaining a responsive and fluid gaming experience.