Problem Statement
What is the oplog and why is its size important? How do you determine the appropriate oplog size?
Explanation
The oplog, or operations log, is a special capped collection in the local database that records all write operations on the primary node. It stores operations in chronological order with timestamps. Secondary nodes replicate by reading the oplog and applying these operations to their data.
Oplog size is critical because it is a capped collection with a fixed maximum size. Once full, the oldest entries are overwritten to make room for new operations. If a secondary falls too far behind and the operations it needs have been overwritten, it cannot catch up through normal replication and must perform a full initial sync.
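The overwrite behavior described above can be illustrated with a toy model. This is a minimal sketch, not MongoDB's implementation: `ToyOplog`, `append`, `oldestTs`, and `canResumeFrom` are hypothetical names, timestamps are plain integers, and the cap is a number of entries rather than a size in bytes.

```javascript
// Toy model of a capped oplog (assumption: capped by entry count, not bytes).
class ToyOplog {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = []; // oldest entry first
  }

  // Record an operation; once the cap is exceeded, drop the oldest entry.
  append(ts, op) {
    this.entries.push({ ts, op });
    if (this.entries.length > this.maxEntries) {
      this.entries.shift();
    }
  }

  // Timestamp of the oldest operation still held, or null if empty.
  oldestTs() {
    return this.entries.length > 0 ? this.entries[0].ts : null;
  }

  // A secondary can resume only if the last operation it applied is
  // still present; otherwise it has fallen off the end of the oplog.
  canResumeFrom(lastAppliedTs) {
    const oldest = this.oldestTs();
    return oldest !== null && lastAppliedTs >= oldest;
  }
}

const oplog = new ToyOplog(3);
for (let ts = 1; ts <= 5; ts++) {
  oplog.append(ts, "write " + ts);
}
console.log(oplog.oldestTs());       // 3 (entries 1 and 2 were overwritten)
console.log(oplog.canResumeFrom(4)); // true
console.log(oplog.canResumeFrom(2)); // false: this secondary needs an initial sync
```

A secondary whose last applied operation is older than the oldest retained entry is the case that forces a full initial sync.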
Determining the right oplog size depends on your write volume and how long secondaries might be offline. The oplog should be large enough to hold at least 24 to 48 hours of operations, giving you time to address issues before a secondary falls too far behind. For deployments with high write rates or where maintenance might require nodes to be offline for extended periods, you need a larger oplog.
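You can estimate the required oplog size from how much oplog data your writes generate per hour, multiplied by the window you need. Monitor your oplog window, which is the time span between the oldest and newest operations stored in the oplog. If the window is too short relative to your maintenance windows or replication lag, increase the oplog size. On MongoDB 3.6 and later with the WiredTiger storage engine, the replSetResizeOplog command resizes the oplog on a running replica set member without downtime; run it on each member whose oplog should change.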
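As a rough illustration of the sizing arithmetic, the sketch below multiplies an hourly oplog growth rate by a target window. The function names and the `headroom` safety factor are assumptions made for this example, not MongoDB APIs. The hourly rate is assumed to be measured in MB of oplog data per hour, for example by dividing the used oplog size by the current window in hours.

```javascript
// Estimate the oplog size (in MB) needed to retain a target window.
// Assumption: oplogMBPerHour is a measured average; headroom pads for bursts.
function estimateOplogSizeMB(oplogMBPerHour, targetWindowHours, headroom = 1.5) {
  if (oplogMBPerHour <= 0 || targetWindowHours <= 0 || headroom < 1) {
    throw new RangeError("rate and window must be positive, headroom >= 1");
  }
  return Math.ceil(oplogMBPerHour * targetWindowHours * headroom);
}

// Inverse check: how many hours a given oplog size covers at a given rate.
function estimateWindowHours(oplogSizeMB, oplogMBPerHour) {
  if (oplogSizeMB <= 0 || oplogMBPerHour <= 0) {
    throw new RangeError("size and rate must be positive");
  }
  return oplogSizeMB / oplogMBPerHour;
}

// Example: 200 MB of oplog per hour, 48-hour target, default 1.5x headroom.
console.log(estimateOplogSizeMB(200, 48));    // 14400
console.log(estimateWindowHours(16000, 200)); // 80
```

Averages hide bursts such as bulk loads or large multi-document updates, so it is worth measuring the rate during the busiest period rather than a quiet one.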
Code Solution
// Check current oplog size
use local
db.oplog.rs.stats(1024*1024) // Size in MB
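// Check oplog time window
// The time component of a BSON Timestamp is in seconds, so divide by 3600 only.
// getTime() is the legacy mongo shell accessor; db.getReplicationInfo().timeDiffHours also reports this window.
var first = db.oplog.rs.find().sort({$natural:1}).limit(1).next()
var last = db.oplog.rs.find().sort({$natural:-1}).limit(1).next()
var windowHours = (last.ts.getTime() - first.ts.getTime()) / 3600
print("Oplog window: " + windowHours + " hours")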
// Resize the oplog on a running member (MongoDB 3.6+, WiredTiger)
// The command affects only the member it runs on, so repeat it on each member
db.adminCommand({
  replSetResizeOplog: 1,
  size: 16000 // New size in MB
})