Question

What factors should you consider when choosing a shard key? Explain the consequences of a poor shard key choice.

Accepted Answer

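Choosing a shard key is one of the most critical decisions in sharding because it is costly to change later. Before MongoDB 4.4 the shard key was fixed once the collection was sharded. Version 4.4 added refineCollectionShardKey, which can only append suffix fields to the existing key, and 5.0 added reshardCollection, which switches to a different key but rewrites the entire collection. A good shard key must satisfy three main criteria: high cardinality, even distribution, and query pattern alignment.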

High cardinality means the shard key has many distinct values. This allows MongoDB to distribute data across many chunks and shards. A low-cardinality key, such as a status field with only three values, can produce at most three chunks, because a chunk cannot be split below a single shard key value. However many shards you add, at most three of them will hold data.
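The cardinality limit can be illustrated with a small pure-Python sketch. It does not talk to MongoDB; the function names and sample data are hypothetical, and the model assumes the best case in which every distinct key value ends up in its own chunk on its own shard.

```python
def max_chunks(shard_key_values):
    """Upper bound on chunks: a chunk cannot be split below one key value."""
    return len(set(shard_key_values))


def shards_with_data(shard_key_values, num_shards):
    """Best case: each chunk lands on a different shard until shards run out."""
    return min(max_chunks(shard_key_values), num_shards)


# Hypothetical sample: 9,000 documents in a 10-shard cluster.
statuses = ["active", "inactive", "pending"] * 3000  # low cardinality
user_ids = list(range(9000))                         # high cardinality

print(shards_with_data(statuses, num_shards=10))  # 3
print(shards_with_data(user_ids, num_shards=10))  # 10
```

With the three-valued status key, seven of the ten shards would stay empty regardless of data volume.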

Even distribution means reads and writes are spread across all shards, avoiding hotspots. With range-based sharding, monotonically increasing keys such as timestamps or auto-incrementing IDs are poor choices because every new write lands in the chunk covering the highest key range, which lives on a single shard. This creates a write hotspot that defeats the purpose of sharding.
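The hotspot effect described above can be simulated without a cluster. The sketch below is a simplified model, not MongoDB's implementation: the split points are made up, MD5 is only a stand-in hash, and placing hashed keys by modulo is an assumption made for brevity (MongoDB assigns ranges of hashed values to chunks).

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 4


def range_shard(key, split_points):
    """Ranged placement: pick the shard whose key range contains the value."""
    return bisect.bisect_right(split_points, key)


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Stand-in for hashed placement: hash the key, then bucket the hash."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Existing data covers ids 0..999, split into four equal ranges (shards 0-3).
split_points = [250, 500, 750]

# New inserts use ever-increasing ids, as an auto-increment counter would.
new_ids = range(1000, 2000)

ranged = Counter(range_shard(i, split_points) for i in new_ids)
hashed = Counter(hashed_shard(i) for i in new_ids)

print("ranged:", dict(ranged))  # ranged: {3: 1000}
print("hashed:", dict(sorted(hashed.items())))
```

Under ranged placement all 1,000 inserts go to the last shard, while the hashed placement spreads them roughly evenly over the four shards.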

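Query pattern alignment means your most common queries should include the shard key, or at least a prefix of a compound shard key. When queries include the shard key, mongos can route them directly to the shards that own the matching chunks. Without the shard key, mongos must broadcast the query to every shard and merge the results, so every shard does work for every query and latency is set by the slowest shard.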
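A toy router makes the difference between targeted and broadcast queries concrete. This is a sketch of the idea only, not how mongos is implemented: the chunk layout, shard names, and the `route` function are hypothetical, and it handles only equality matches on a non-negative `user_id`.

```python
import bisect

# Hypothetical layout for a collection range-sharded on user_id:
# SHARDS[i] owns the key range that starts at LOWER_BOUNDS[i].
LOWER_BOUNDS = [0, 1000, 2000, 3000]
SHARDS = ["shardA", "shardB", "shardC", "shardD"]


def route(query, shard_key="user_id"):
    """Return the shards a router would contact for an equality query."""
    if shard_key in query:
        # Targeted: the key value identifies exactly one owning shard.
        idx = bisect.bisect_right(LOWER_BOUNDS, query[shard_key]) - 1
        return [SHARDS[idx]]
    # Scatter-gather: without the shard key, any shard might hold matches.
    return list(SHARDS)


print(route({"user_id": 2500, "status": "active"}))  # ['shardC']
print(route({"status": "active"}))  # ['shardA', 'shardB', 'shardC', 'shardD']
```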

Poor shard key choices lead to several problems. Uneven distribution causes some shards to fill up while others sit nearly idle, wasting resources and limiting scalability. Hotspots concentrate read or write activity on one shard, so cluster throughput is capped at what that single shard can handle. Scatter-gather queries that hit all shards are slow and resource-intensive.

A common strategy is using a compound shard key that combines a field with good distribution, like user ID, with a time-based field for efficient time-range queries within each user. Another approach is using a hashed shard key for monotonically increasing values to ensure even distribution. The trade-off is that range queries on a hashed field can no longer be targeted and become scatter-gather.
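As a rough illustration of both strategies, the sketch below builds the `shardCollection` admin command documents in Python. The namespaces `app.events` and `app.logs`, the field names, and the helper `shard_collection_cmd` are hypothetical; the code only constructs the documents and does not connect to a cluster.

```python
def shard_collection_cmd(namespace, key):
    """Build a shardCollection command document (command name goes first)."""
    return {"shardCollection": namespace, "key": key}


# Compound key: user_id spreads writes, created_at keeps each user's
# documents ordered by time for range queries.
compound = shard_collection_cmd("app.events", {"user_id": 1, "created_at": 1})

# Hashed key: spreads a monotonically increasing _id across chunks.
hashed = shard_collection_cmd("app.logs", {"_id": "hashed"})

print(compound)
print(hashed)

# Assuming PyMongo and a reachable mongos router, each document could be
# sent with something like:
#   MongoClient("mongodb://<mongos-host>:27017").admin.command(compound)
```

The connection string in the final comment is a placeholder, and a real deployment may need further setup before the command succeeds.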
