Primary MongoDB Performance Considerations
Memory or working set (for latency): Ensure that frequently accessed data and indexes (the working set) fit in RAM. When the working set exceeds available memory and spills to disk, read latency increases significantly.
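As a rough illustration of the working-set check described above, the sketch below compares an estimated working set against the default WiredTiger internal cache size, which is the larger of 50% of (RAM minus 1 GiB) or 256 MiB. The function names and the input figures are hypothetical; in practice the sizes would come from your own measurements, for example the dataSize and indexSize fields of db.stats(). The check is deliberately conservative: it ignores the operating system's filesystem cache and the difference between compressed on-disk and uncompressed in-cache sizes.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Default WiredTiger cache: larger of 50% of (RAM - 1 GiB) or 256 MiB."""
    return max((total_ram_bytes - GIB) // 2, 256 * MIB)


def working_set_fits(hot_data_bytes: int, index_bytes: int, total_ram_bytes: int) -> bool:
    """Return True if frequently accessed data plus indexes fit in the default cache.

    hot_data_bytes and index_bytes are estimates supplied by the caller
    (hypothetical inputs for this sketch, not values MongoDB reports directly).
    """
    cache_bytes = default_wiredtiger_cache_bytes(total_ram_bytes)
    return hot_data_bytes + index_bytes <= cache_bytes


if __name__ == "__main__":
    # Example: 5 GiB of hot data and 2 GiB of indexes on a 16 GiB host.
    print(working_set_fits(5 * GIB, 2 * GIB, 16 * GIB))
```

If the check fails, the options are to add RAM, reduce the working set (for example by dropping unused indexes), or accept more disk reads.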
CPU and concurrency: Workloads with high write volumes, heavy aggregations, or many concurrent connections need sufficient CPU capacity. CPU saturation typically shows up as increased tail latency, so provision enough headroom for both concurrency and computation.
Storage latency and IOPS: For disk-bound workloads, choose SSD Premium to achieve consistently low latency and high IOPS. This matters most for write-heavy workloads, large or numerous indexes, and bursty traffic.
Indexes and query patterns: Correct indexing and efficient query shapes prevent collection scans and reduce random Input/Output (I/O). Poor indexes are one of the most common causes of slow queries.
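One way to catch collection scans is to inspect the output of explain() for a COLLSCAN stage in the winning plan. The sketch below walks a plan document that has already been loaded as a Python dictionary; it needs no database connection. The helper names are hypothetical. It assumes the common explain layout (queryPlanner.winningPlan with nested inputStage or inputStages, and an optional queryPlan wrapper used by the slot-based engine) and does not handle per-shard plans on sharded clusters.

```python
from typing import Any, Dict, Iterator


def iter_stages(plan: Dict[str, Any]) -> Iterator[str]:
    """Yield every stage name in a plan tree, depth-first."""
    if "stage" in plan:
        yield plan["stage"]
    child = plan.get("inputStage")
    if isinstance(child, dict):
        yield from iter_stages(child)
    for child in plan.get("inputStages", []):
        yield from iter_stages(child)


def uses_collection_scan(explain_output: Dict[str, Any]) -> bool:
    """Return True if the winning plan contains a COLLSCAN stage."""
    winning = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    # Some server versions nest the classic plan tree under "queryPlan".
    winning = winning.get("queryPlan", winning)
    return "COLLSCAN" in iter_stages(winning)


if __name__ == "__main__":
    # Hand-written, trimmed explain documents for illustration only.
    scan = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}
    indexed = {
        "queryPlanner": {
            "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}}
        }
    }
    print(uses_collection_scan(scan), uses_collection_scan(indexed))
```

A check like this can run against explain output captured for slow queries to flag the ones that need an index.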
Replication settings and topology: Replication is primarily for availability and failover. It can improve read throughput if your application distributes reads to secondaries with an appropriate read preference, but write durability settings such as w:majority, along with cross-zone latency, can increase write latency.
Network placement: Keep the application and database close together, such as in the same region or on a low-latency network. Network latency affects query response times and replication behavior.
Connection management: Use driver connection pooling and avoid excessive connections, which can increase CPU or memory overhead and latency variance.
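Pool size, read preference, and write concern can all be set through standard connection string options (maxPoolSize, readPreference, w, replicaSet). The sketch below assembles such a URI with the Python standard library only. The function name, host names, replica set name, and the pool size of 50 are hypothetical values chosen for illustration, not recommendations.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_mongodb_uri(
    hosts: Sequence[str],
    replica_set: str,
    max_pool_size: int = 50,
    read_from_secondaries: bool = False,
    majority_writes: bool = True,
) -> str:
    """Build a mongodb:// URI with pooling, read preference, and write concern."""
    options = {
        "replicaSet": replica_set,
        # Caps the number of connections each client keeps per server.
        "maxPoolSize": max_pool_size,
        # secondaryPreferred offloads reads but may return slightly stale data.
        "readPreference": "secondaryPreferred" if read_from_secondaries else "primary",
        # w=majority improves durability at the cost of write latency.
        "w": "majority" if majority_writes else 1,
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


if __name__ == "__main__":
    # Placeholder hosts; replace with your own deployment's addresses.
    print(build_mongodb_uri(["db0.example.net:27017", "db1.example.net:27017"], "rs0"))
```

Keeping these settings in one place makes it easier to compare latency before and after changing the pool size or write concern.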
Scaling approach: Scale CPU, RAM, and storage vertically first, then consider sharding when a single replica set is consistently at its limits. Sharding adds operational complexity, but it enables horizontal scale-out.