Problem Statement
Explain the difference between targeted queries and broadcast queries in a sharded cluster. How does this affect performance?
Explanation
In a sharded cluster, queries fall into two categories based on whether they include the shard key: targeted queries and broadcast queries. The difference significantly impacts performance.
Targeted queries include the shard key in the query filter. When mongos receives a targeted query, it can determine exactly which shard or shards contain the relevant data by checking the config servers. Mongos then routes the query only to those specific shards. For example, if you query for a specific user ID and user ID is your shard key, mongos routes to only the shard containing that user ID range. This is very efficient because only one shard needs to process the query.
Broadcast queries do not include the shard key in the filter. Without the shard key, mongos cannot determine which shards contain relevant data, so it must broadcast the query to all shards. Each shard processes the query independently and returns results to mongos, which then merges the results before returning them to the application. This is much slower because all shards must be queried, network traffic multiplies, and mongos must merge potentially large result sets.
The performance impact is substantial. Targeted queries scale linearly; adding more shards does not slow them down because each query still hits only specific shards. Broadcast queries get slower as you add shards because more shards must be queried. In a ten-shard cluster, a broadcast query does ten times the work of querying a single shard.
To optimize performance, design your shard key and queries so that common operations are targeted. Include the shard key in query filters whenever possible. For queries that cannot include the shard key, consider using covered indexes on each shard to make the broadcast queries more efficient. Monitor your query patterns using profiling and explain to identify broadcast queries that can be optimized.
Code Solution
SolutionRead Only
// Shard key: { userId: 1 }
// Targeted query - includes shard key
db.orders.find({ userId: 12345, status: "completed" })
// Mongos knows userId 12345 is on Shard 2
// Routes to Shard 2 only - FAST
// Broadcast query - no shard key
db.orders.find({ status: "completed" })
// Mongos doesn't know which shards have completed orders
// Queries all shards, merges results - SLOW
// Explain shows broadcast
db.orders.find({ status: "completed" }).explain()
// Shows: SHARD_MERGE stage (broadcast to all shards)
// Best practice: include shard key
db.orders.find({
userId: { $in: [123, 456, 789] }, // Shard key
status: "completed"
})
// Targeted to specific shards - FAST