Capped collections
Capped collections are a special type of MongoDB collection with a fixed maximum size and support for high-throughput inserts. When a capped collection reaches its maximum size, it automatically removes the oldest documents to make room for new ones. This makes capped collections a good fit for use cases such as logging, caching, and real-time analytics, where you need a FIFO (First-In, First-Out) data structure.
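To make the eviction behavior concrete, here is a minimal in-memory sketch in plain JavaScript. It is only a model: the class name CappedBuffer is hypothetical, and using JSON string length as a stand-in for document size is an assumption for illustration, not how MongoDB measures or stores documents.

```javascript
// Illustrative model only: a byte-budgeted FIFO buffer that mimics how a
// capped collection discards its oldest documents. `CappedBuffer` is a
// hypothetical name; JSON length is a rough stand-in for BSON size.
class CappedBuffer {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.usedBytes = 0;
    this.docs = []; // kept in insertion order, oldest first
  }

  insert(doc) {
    const bytes = JSON.stringify(doc).length;
    if (bytes > this.maxBytes) {
      throw new Error("document is larger than the whole buffer");
    }
    // Evict from the front (oldest) until the new document fits.
    while (this.usedBytes + bytes > this.maxBytes) {
      const evicted = this.docs.shift();
      this.usedBytes -= evicted.bytes;
    }
    this.docs.push({ doc, bytes });
    this.usedBytes += bytes;
  }

  // Documents in insertion order, oldest first.
  toArray() {
    return this.docs.map((entry) => entry.doc);
  }
}

const buffer = new CappedBuffer(40);
for (let i = 1; i <= 5; i++) {
  buffer.insert({ n: i }); // each document serializes to 7 characters
}
buffer.insert({ n: 6 }); // does not fit, so { n: 1 } is evicted first
console.log(buffer.toArray());
```

After the sixth insert the buffer holds the documents with n from 2 to 6: the oldest entry was dropped and the remaining ones kept their insertion order.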
Characteristics of Capped Collections
Fixed Size: The size of the capped collection is predetermined. Once the size limit is reached, older documents are automatically removed.
Preserves Insertion Order: Documents are stored in the order they were inserted, which makes it easy to retrieve documents based on insertion order.
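High Throughput: Because documents are kept in insertion order and space is reclaimed automatically, capped collections support fast inserts and can return documents in insertion order without an extra index.
Restricted Updates: You can update documents in a capped collection, but updates that change the document size conflict with the fixed size constraint, and older MongoDB versions reject them outright. Prefer updates that leave the document size unchanged.
Restricted Deletes: Older MongoDB versions do not let you remove individual documents from a capped collection; newer releases relax this, so check the documentation for your server version. In any version, you can remove the whole collection with drop() and recreate it.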
Creating a Capped Collection
You can create a capped collection using the createCollection method with the capped and size options:
db.createCollection("myCappedCollection", { capped: true, size: 100000 })
Here, size is the maximum size in bytes for the capped collection.
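Choosing the byte limit usually starts from a rough estimate. The sketch below derives the options from an assumed average document size and a target document count; the helper name cappedOptions and its parameters are hypothetical. It also sets max, an optional createCollection option that limits the number of documents. The size limit still applies, and old documents are removed as soon as either limit is reached.

```javascript
// Sketch: derive createCollection options from a rough sizing estimate.
// `cappedOptions`, `avgDocBytes`, and `maxDocs` are hypothetical names.
function cappedOptions(avgDocBytes, maxDocs) {
  return {
    capped: true,
    size: avgDocBytes * maxDocs, // byte limit, required for capped collections
    max: maxDocs, // optional limit on the number of documents
  };
}

const options = cappedOptions(200, 500); // 200 bytes * 500 documents = 100000 bytes
console.log(options);

// In the MongoDB shell, where `db` is defined, pass the options on directly.
if (typeof db !== "undefined") {
  db.createCollection("myCappedCollection", options);
}
```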
Converting a Regular Collection to Capped
You can convert an existing collection to a capped collection using the convertToCapped command:
db.runCommand({ convertToCapped: 'myCollection', size: 100000 })
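A script that may run more than once can check whether the collection is already capped before converting it. The following is a sketch for the MongoDB shell; ensureCapped is a hypothetical helper, and it assumes the shell's getCollection and isCapped methods are available on the handle it receives.

```javascript
// Sketch: convert a collection only if it is not capped already.
// `ensureCapped` is a hypothetical helper; `database` is a shell database handle.
function ensureCapped(database, name, sizeBytes) {
  if (database.getCollection(name).isCapped()) {
    return { ok: 1, alreadyCapped: true }; // nothing to do
  }
  return database.runCommand({ convertToCapped: name, size: sizeBytes });
}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(ensureCapped(db, "myCollection", 100000));
}
```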
Querying a Capped Collection
Querying a capped collection is the same as querying a regular collection. However, you can take advantage of the natural order in which documents are stored:
db.myCappedCollection.find().sort({ $natural: -1 })
This query retrieves the most recently inserted documents first.
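In practice you often want only the last few documents rather than the whole collection in reverse. A minimal sketch for the MongoDB shell, combining the natural-order sort with limit; the helper name latest is hypothetical:

```javascript
// Sketch: fetch the `n` most recently inserted documents.
// `latest` is a hypothetical helper; `collection` is a shell collection object.
function latest(collection, n) {
  return collection
    .find()
    .sort({ $natural: -1 }) // reverse insertion order: newest first
    .limit(n)
    .toArray();
}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(latest(db.myCappedCollection, 10));
}
```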
Use Cases
Logging: Store log entries and automatically remove the oldest when the collection fills up.
Real-time Analytics: Use for real-time metrics where only the most recent data is relevant.
Caching: Store frequently accessed data up to a certain limit.
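As an illustration of the logging use case, the sketch below appends entries to a capped collection from the MongoDB shell. The helper name logEvent and the field names ts, level, and message are hypothetical choices, not a prescribed schema.

```javascript
// Sketch: append a log entry to a capped collection.
// `logEvent` and the fields `ts`, `level`, `message` are hypothetical.
function logEvent(collection, level, message) {
  const entry = { ts: new Date(), level, message };
  collection.insertOne(entry); // oldest entries are discarded once the cap is reached
  return entry;
}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  logEvent(db.myCappedCollection, "info", "service started");
}
```

No cleanup job is needed with this approach: once the collection is full, each new entry pushes out the oldest one.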
Considerations
Indexes: By default, capped collections only have an index on the _id field. You can add additional indexes, but remember that every extra index adds work to each insert, which can reduce the write throughput that capped collections are usually chosen for.
No Sharding: Capped collections cannot be sharded, which means they are not suitable for horizontal scaling.