MongoDB Basics
MongoDB is a document-oriented NoSQL database that stores data in flexible, JSON-like documents. Instead of tables and rows, you work with collections and documents — a model that maps naturally to objects in JavaScript, Python, and other application languages.
Why MongoDB?
| Relational (SQL) | MongoDB (Document) |
|---|---|
| Tables with fixed schema | Collections with flexible documents |
| JOIN across tables | Embed related data or reference by ID |
| Vertical scaling emphasis | Horizontal sharding built-in |
| SQL query language | MongoDB Query API / MQL |
MongoDB excels when your data structure evolves frequently, when you need to scale writes horizontally, or when documents map 1:1 to application objects.
BSON — Binary JSON
MongoDB stores documents in BSON (Binary JSON):
- Extends JSON with additional types: Date, ObjectId, Binary, Decimal128, Regex
- Binary encoding is faster to parse than text JSON
- Field order is preserved in documents
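Example document (shown in mongosh shell notation for readability; ObjectId(...), ISODate(...), and NumberDecimal(...) are shell helpers, not valid JSON):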
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "Alice Chen",
  "email": "[email protected]",
  "tags": ["developer", "mongodb"],
  "address": {
    "city": "Seattle",
    "country": "US"
  },
  "createdAt": ISODate("2024-06-13T10:00:00Z"),
  "balance": NumberDecimal("1250.50")
}
The _id field is the primary key — MongoDB auto-generates an ObjectId if you omit it.
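An ObjectId is 12 bytes long, and its first 4 bytes hold the creation time in seconds since the Unix epoch. The plain JavaScript sketch below decodes that timestamp from the hex string of the example document. The name objectIdTimestamp is a hypothetical helper, not part of any driver; in mongosh, ObjectId(...).getTimestamp() returns the same value.

```javascript
// Hypothetical helper: decode the creation time embedded in an ObjectId.
// Assumes `hex` is the 24-character hexadecimal form of the ObjectId.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected a 24-character hex string");
  }
  // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(objectIdTimestamp("507f1f77bcf86cd799439011").toISOString());
// prints 2012-10-17T21:13:27.000Z
```

Because the timestamp leads the value, sorting by _id roughly follows insertion time, which is one reason _id is a common key for pagination.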
Common BSON Types
| Type | Example | Notes |
|---|---|---|
| String | "hello" | UTF-8 |
| Int32/Int64 | NumberInt(42) | Use Int64 for large counters |
| Double | 3.14 | Default for floating point |
| Decimal128 | NumberDecimal("19.99") | Financial calculations |
| Date | ISODate("2024-01-01") | UTC stored internally |
| ObjectId | ObjectId("...") | 12-byte unique identifier |
| Array | [1, 2, 3] | Ordered, typed elements |
| Object | { a: 1 } | Nested documents |
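To see why the table recommends Decimal128 over Double for money, it may help to look at how doubles behave. JavaScript numbers are IEEE 754 doubles, the same representation as BSON Double, so the rounding issue can be reproduced without a database. The sketch below is only an illustration; centsToDecimalString is a hypothetical helper for producing a string that could be passed to NumberDecimal(...).

```javascript
// JavaScript numbers are IEEE 754 doubles, like BSON Double.
const asDoubles = 0.1 + 0.2;
console.log(asDoubles === 0.3); // prints false

// Summing integer cents is exact (prices taken from the order example: 2 x 19.99 + 29.99).
const lineItemCents = [1999, 1999, 2999];
const totalCents = lineItemCents.reduce((sum, cents) => sum + cents, 0);
console.log(totalCents); // prints 6997

// Hypothetical helper: format integer cents as a decimal string.
function centsToDecimalString(cents) {
  const sign = cents < 0 ? "-" : "";
  const abs = Math.abs(cents);
  const whole = Math.floor(abs / 100);
  const fraction = String(abs % 100).padStart(2, "0");
  return `${sign}${whole}.${fraction}`;
}

console.log(centsToDecimalString(totalCents)); // prints 69.97
```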
Architecture Hierarchy
MongoDB Server (mongod)
└── Database (e.g., "shop")
    ├── Collection (e.g., "products")
    │   ├── Document { ... }
    │   └── Document { ... }
    └── Collection (e.g., "orders")
        └── Document { ... }
- Database — namespace container; use one database per application (typically)
- Collection — group of documents (like a table, but schema-flexible)
- Document — single record, max 16 MB
Connecting with mongosh
# Install and start MongoDB, then connect
mongosh
# Or connect to a remote instance
mongosh "mongodb://user:pass@host:27017/shop?authSource=admin"
# Connection string with replica set
mongosh "mongodb://host1,host2,host3/shop?replicaSet=rs0"
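Connection strings like the ones above are often assembled from configuration values. A minimal sketch of that, assuming the standard mongodb:// format, follows; buildMongoUri and its parameter names are hypothetical. User names and passwords are percent-encoded because characters such as "@" or "/" would otherwise be read as URI delimiters.

```javascript
// Hypothetical helper: assemble a mongodb:// connection string from its parts.
// Assumes `password` is provided whenever `user` is.
function buildMongoUri({ user, password, hosts, database, options = {} }) {
  // Percent-encode credentials so "@", ":" and "/" cannot be mistaken for delimiters.
  const credentials = user
    ? `${encodeURIComponent(user)}:${encodeURIComponent(password)}@`
    : "";
  const query = new URLSearchParams(options).toString();
  const suffix = query ? `?${query}` : "";
  return `mongodb://${credentials}${hosts.join(",")}/${database}${suffix}`;
}

console.log(
  buildMongoUri({
    hosts: ["host1", "host2", "host3"],
    database: "shop",
    options: { replicaSet: "rs0" },
  })
);
// prints mongodb://host1,host2,host3/shop?replicaSet=rs0
```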
Essential Shell Commands
// List all databases
show dbs
// Switch to (or create) a database
use shop
// Insert a document — creates collection implicitly
db.products.insertOne({
  name: "Wireless Mouse",
  price: NumberDecimal("29.99"),
  category: "electronics",
  inStock: true,
  createdAt: new Date()
})
// List collections in current database
show collections
// Query documents
db.products.find()
db.products.find({ category: "electronics" })
db.products.find({ price: { $lt: 50 } })
// Count documents
db.products.countDocuments({ inStock: true })
// Update
db.products.updateOne(
  { name: "Wireless Mouse" },
  { $set: { price: NumberDecimal("24.99") } }
)
// Delete
db.products.deleteOne({ name: "Wireless Mouse" })
CRUD Operations in Detail
Create
db.users.insertMany([
  { name: "Alice", role: "admin", scores: [95, 87, 92] },
  { name: "Bob", role: "editor", scores: [78, 82] },
  { name: "Carol", role: "viewer" }
])
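Pass { ordered: false } as the second argument to insertMany to keep inserting the remaining documents after one fails, for example on a duplicate key error. With the default (ordered: true), the operation stops at the first error.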
Read with Projection and Sort
// Only return name and role, sort by name
db.users.find(
  { role: { $ne: "viewer" } },
  { name: 1, role: 1, _id: 0 }
).sort({ name: 1 }).limit(20)
Update Operators
// Add to array, increment field
db.users.updateOne(
  { name: "Alice" },
  {
    $push: { scores: 98 },
    $inc: { loginCount: 1 },
    $set: { lastLogin: new Date() }
  }
)
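To make the effect of these operators concrete, here is a simplified in-memory illustration in plain JavaScript. It is not how the server is implemented: applyUpdate is a hypothetical function that handles only $set, $inc, and $push on top-level fields and ignores dotted paths and every other operator.

```javascript
// Simplified illustration of $set, $inc, and $push on top-level fields only.
function applyUpdate(doc, update) {
  const result = { ...doc }; // shallow copy so the input document is left untouched
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value;
  }
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    // $inc creates the field with the given amount when it does not exist yet
    result[field] = (result[field] ?? 0) + amount;
  }
  for (const [field, value] of Object.entries(update.$push ?? {})) {
    // $push appends to the array, creating the array when the field is missing
    result[field] = [...(result[field] ?? []), value];
  }
  return result;
}

const alice = { name: "Alice", role: "admin", scores: [95, 87, 92] };
const updated = applyUpdate(alice, {
  $push: { scores: 98 },
  $inc: { loginCount: 1 },
});

console.log(updated.scores);     // prints [ 95, 87, 92, 98 ]
console.log(updated.loginCount); // prints 1
```

Note that loginCount did not exist on the original document; $inc on a missing field sets it to the increment value, so no separate initialization step is needed.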
Delete
db.users.deleteMany({ role: "viewer" })
// Returns { acknowledged: true, deletedCount: N }
Schema Design Patterns
Embed when data is accessed together and has a 1-to-few relationship:
// Order with embedded line items
{
  orderId: "ORD-001",
  customer: "Alice",
  items: [
    { product: "Widget", qty: 2, price: NumberDecimal("19.99") },
    { product: "Gadget", qty: 1, price: NumberDecimal("29.99") }
  ],
  total: NumberDecimal("69.97")
}
Reference when data is large, shared, or updated independently:
// Order references product by ID
{
  orderId: "ORD-002",
  items: [
    { productId: ObjectId("507f1f77bcf86cd799439011"), qty: 2 }
  ]
}
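With references, something has to resolve each ID back to its document, either on the server (for example with a $lookup aggregation stage) or in application code. The sketch below shows the application-side variant on hypothetical in-memory data; the string IDs stand in for ObjectId values, and resolveItems and priceCents are illustrative names.

```javascript
// Hypothetical data: string IDs stand in for ObjectId values.
const products = [
  { _id: "p1", name: "Widget", priceCents: 1999 },
  { _id: "p2", name: "Gadget", priceCents: 2999 },
];
const order = { orderId: "ORD-002", items: [{ productId: "p1", qty: 2 }] };

// Join in application code: index products by _id, then resolve each reference.
function resolveItems(order, products) {
  const byId = new Map(products.map((product) => [product._id, product]));
  return order.items.map((item) => ({
    ...item,
    product: byId.get(item.productId) ?? null, // null marks a dangling reference
  }));
}

const resolved = resolveItems(order, products);
console.log(resolved[0].product.name); // prints Widget
```

The extra lookup is the cost of referencing; the benefit is that a product's name or price lives in one place and can change without rewriting every order.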
Schema Validation (Optional)
Enforce structure when needed without losing flexibility:
db.createCollection("products", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "price"],
      properties: {
        name: { bsonType: "string" },
        price: { bsonType: ["decimal", "double", "int"] },
        inStock: { bsonType: "bool" }
      }
    }
  },
  // "moderate" validates inserts, plus updates to documents that already pass validation;
  // the default, "strict", validates all inserts and updates
  validationLevel: "moderate"
})
Indexes
Indexes speed up queries — create them on frequently filtered fields:
db.products.createIndex({ category: 1, price: -1 })
db.users.createIndex({ email: 1 }, { unique: true })
// View indexes
db.products.getIndexes()
// Explain query plan
db.products.find({ category: "electronics" }).explain("executionStats")
Look for totalDocsExamined close to nReturned — large gaps indicate missing or inefficient indexes.
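The comparison of totalDocsExamined with nReturned can be turned into a small check. The following sketch takes the executionStats section of an explain result and computes documents examined per document returned. The helper name scanEfficiency and the sample numbers are made up for illustration; only the three field names come from the explain output.

```javascript
// Hypothetical helper: summarize the executionStats section of an explain() result.
function scanEfficiency(executionStats) {
  const { nReturned, totalDocsExamined, totalKeysExamined } = executionStats;
  // Documents examined per document returned. Values near 1 suggest the index
  // fits the query; large values suggest a missing or poorly matching index.
  const docsPerResult =
    nReturned === 0 ? totalDocsExamined : totalDocsExamined / nReturned;
  return { nReturned, totalDocsExamined, totalKeysExamined, docsPerResult };
}

// Made-up numbers resembling a collection scan: 10,000 documents read for 25 results.
const sampleStats = { nReturned: 25, totalKeysExamined: 0, totalDocsExamined: 10000 };
console.log(scanEfficiency(sampleStats).docsPerResult); // prints 400
```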
MongoDB vs SQL Quick Reference
| SQL | MongoDB |
|---|---|
| CREATE DATABASE | use dbname (implicit) |
| INSERT INTO | insertOne() / insertMany() |
| SELECT WHERE | find({ filter }) |
| UPDATE SET | updateOne() with $set |
| DELETE FROM | deleteOne() / deleteMany() |
| JOIN | $lookup aggregation or embed |
Application Integration
Node.js with the official driver:
const { MongoClient } = require('mongodb');

async function main() {
  const client = new MongoClient('mongodb://localhost:27017');
  try {
    await client.connect();
    const db = client.db('shop');
    const products = db.collection('products');
    // Fetch every in-stock product into an array
    const result = await products.find({ inStock: true }).toArray();
    console.log(result);
  } finally {
    // Close the client even if the query throws
    await client.close();
  }
}

main().catch(console.error);
The driver maintains a connection pool for each MongoClient, so in production create one MongoClient instance per process and reuse it instead of connecting on every request.
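One way to guarantee a single shared client is to wrap its construction in a function that runs only once. The sketch below shows that pattern with a stand-in factory so it runs without a server; once and getClient are hypothetical names, and the commented line shows how the real MongoClient would be plugged in.

```javascript
// Hypothetical helper: run `factory` at most once and return the same instance afterwards.
function once(factory) {
  let instance;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance;
  };
}

// With the real driver this would be:
//   const getClient = once(() => new MongoClient('mongodb://localhost:27017'));
// A stand-in factory keeps the sketch runnable without a server.
let constructed = 0;
const getClient = once(() => ({ id: ++constructed }));

console.log(getClient() === getClient()); // prints true
console.log(constructed);                 // prints 1
```

Request handlers would then call getClient() rather than constructing a client themselves, so every request shares the same pool.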
Common Mistakes
- Storing money as floating-point double: use Decimal128 instead
- Unbounded arrays in documents (comments, log entries): cap them or move them to a separate collection
- Querying without indexes on large collections
- Using skip() for deep pagination instead of keyset pagination
- Creating a new connection per request, which exhausts file descriptors
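Keyset pagination, mentioned in the list above, replaces skip() with a filter on the last value seen, so the server can seek directly into the index instead of walking past every skipped document. A minimal sketch, assuming pages are ordered by _id, is shown below; keysetPage is a hypothetical helper that only builds the query parts, and the commented lines show the intended use with a driver collection.

```javascript
// Hypothetical helper: build the query parts for one page of keyset pagination on _id.
// `lastSeenId` is the _id of the final document on the previous page (null for the
// first page) and must have the same type as the stored _id values.
function keysetPage(lastSeenId, pageSize) {
  return {
    filter: lastSeenId === null ? {} : { _id: { $gt: lastSeenId } },
    sort: { _id: 1 },
    limit: pageSize,
  };
}

// Intended use with a driver collection (not executed here):
//   const { filter, sort, limit } = keysetPage(lastSeenId, 20);
//   const page = await products.find(filter).sort(sort).limit(limit).toArray();
console.log(JSON.stringify(keysetPage(null, 20)));
// prints {"filter":{},"sort":{"_id":1},"limit":20}
```

The caller remembers the _id of the last document on each page and passes it in for the next one, so the cost of fetching page 500 is about the same as page 1.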
Troubleshooting
// Check collection stats
db.products.stats()
// Validate document sizes
db.products.aggregate([
  { $project: { size: { $bsonSize: "$$ROOT" } } },
  { $group: { _id: null, avgSize: { $avg: "$size" }, maxSize: { $max: "$size" } } }
])
// Find slow queries (requires profiling level 1+)
db.setProfilingLevel(1, { slowms: 100 })
db.system.profile.find().sort({ ts: -1 }).limit(5)
What Comes Next
This track covers CRUD in depth, aggregation pipelines, indexing strategies, replication, sharding, and performance tuning. Document databases reward thoughtful schema design — invest time upfront to match your access patterns.