
Top MongoDB Interview Questions (2026)

JayJay

Whether you're interviewing for a backend role or a dedicated database position, MongoDB questions test whether you understand document databases beyond the basics. Here are the questions that actually come up, with the answers interviewers want to hear.

Fundamental Questions

What is MongoDB and how does it differ from relational databases?

MongoDB is a document database that stores data in JSON-like documents (BSON format). Unlike relational databases that use tables with rows and columns, MongoDB uses collections of documents with flexible schemas.

Key differences:

  • Schema: Relational databases enforce a schema upfront; MongoDB documents can have different structures
  • Relationships: Relational databases use foreign keys and JOINs; MongoDB typically embeds related data or uses references
  • Scaling: Relational databases traditionally scale vertically; MongoDB was designed for horizontal scaling (sharding)
  • Transactions: Relational databases have had ACID transactions for decades; MongoDB added multi-document transactions in version 4.0 (replica sets) and 4.2 (sharded clusters)
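The schema flexibility above can be sketched in plain JavaScript (the collection and field names here are made up for illustration):

```javascript
// Two documents in the same hypothetical "users" collection can have
// completely different shapes -- MongoDB enforces no upfront schema.
const users = [
  { _id: 1, name: "Alice", email: "alice@example.com" },
  { _id: 2, name: "Bob", phone: "555-0100", preferences: { theme: "dark" } }
];

// In mongosh, inserting both shapes into one collection just works:
// db.users.insertMany(users)
const fieldSets = users.map(u => Object.keys(u).sort().join(","));
console.log(fieldSets[0]); // "_id,email,name"
console.log(fieldSets[1]); // "_id,name,phone,preferences"
```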

What is BSON?

BSON (Binary JSON) is the binary representation MongoDB uses to store documents. It extends JSON with additional data types:

  • ObjectId: 12-byte unique identifiers
  • Date: 64-bit integer milliseconds since Unix epoch
  • Binary data: For storing blobs
  • Decimal128: High-precision decimal numbers
  • Regular expressions

BSON is designed for fast encoding/decoding and efficient storage, and it maps cleanly to and from JSON.

What is an ObjectId?

ObjectId is MongoDB's default primary key type, a 12-byte identifier that's roughly chronologically sortable.

Structure (12 bytes):

  • 4 bytes: Unix timestamp
  • 5 bytes: Random value (process-unique)
  • 3 bytes: Incrementing counter

JAVASCRIPT
// Creating an ObjectId
const id = new ObjectId();
// ObjectId("65a27b40d4e5f6a7b8c9d0e1")

// Extracting timestamp
id.getTimestamp();
// ISODate("2024-01-13T12:00:00Z")

ObjectIds are generated client-side by default, enabling insert operations without a round-trip to the server.

Data Modeling Questions

When should you embed documents vs. reference them?

Embed when:

  • Data is accessed together (one-to-few relationship)
  • The embedded document doesn't need to be queried independently
  • The embedded data doesn't grow unboundedly
  • You want atomic updates on the parent

JAVASCRIPT
// Embedded: Address belongs to user, always accessed together
{
  name: "Alice",
  address: {
    street: "123 Main St",
    city: "Seattle"
  }
}

Reference when:

  • Many-to-many relationships
  • The referenced data is large or accessed independently
  • The relationship can grow without limit
  • Data normalization reduces duplication

JAVASCRIPT
// Referenced: Author has many books, books are accessed independently
// User document
{ _id: 1, name: "Alice" }

// Book documents
{ _id: 101, title: "Book 1", author_id: 1 }
{ _id: 102, title: "Book 2", author_id: 1 }

What is the document size limit?

16 MB per document. This is a hard limit in MongoDB.

If you need to store larger data (files, images), use GridFS, which splits files into chunks stored as separate documents.
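To illustrate the idea (not the driver API): GridFS writes file bytes into an fs.chunks collection, 255 KB per chunk by default, plus a metadata document in fs.files. A plain-JS sketch of the chunking step:

```javascript
// Simplified sketch of how GridFS splits a file into chunk documents.
const CHUNK_SIZE = 255 * 1024; // GridFS default chunk size

function toChunks(fileId, buffer) {
  const chunks = [];
  for (let n = 0; n * CHUNK_SIZE < buffer.length; n++) {
    chunks.push({
      files_id: fileId, // reference back to the fs.files metadata document
      n,                // chunk sequence number
      data: buffer.subarray(n * CHUNK_SIZE, (n + 1) * CHUNK_SIZE)
    });
  }
  return chunks;
}

const chunks = toChunks("file-1", Buffer.alloc(600 * 1024)); // ~600 KB file
console.log(chunks.length); // 3 (two full 255 KB chunks plus a 90 KB remainder)
```

In practice you would use the driver's GridFS API (e.g. GridFSBucket in the Node driver) rather than chunking by hand.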

How do you handle many-to-many relationships?

Option 1: Array of references (simpler, good for small arrays)

JAVASCRIPT
// Student has array of course IDs
{ _id: 1, name: "Alice", courses: [101, 102, 103] }

// Course has array of student IDs
{ _id: 101, name: "Math", students: [1, 2, 3] }

Option 2: Junction collection (better for large relationships)

JAVASCRIPT
// Enrollment collection
{ student_id: 1, course_id: 101, enrolled_at: ISODate(...) }
{ student_id: 1, course_id: 102, enrolled_at: ISODate(...) }

The junction collection approach is preferred when:

  • The relationship has its own attributes (like enrollment date)
  • The arrays would grow very large
  • You need to query the relationship itself
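Querying the junction collection is an aggregation job. A sketch of fetching one student's courses (collection and field names follow the example above); in mongosh this pipeline would be passed to db.enrollments.aggregate(...):

```javascript
const studentId = 1; // hypothetical student

const pipeline = [
  // Stage 1: only this student's enrollments
  { $match: { student_id: studentId } },
  // Stage 2: pull in the matching course document
  { $lookup: {
      from: "courses",
      localField: "course_id",
      foreignField: "_id",
      as: "course"
  }},
  // Stage 3: $lookup outputs an array; flatten it to one course per row
  { $unwind: "$course" }
];

console.log(pipeline.length); // 3 stages
```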

Indexing Questions

What types of indexes does MongoDB support?

  1. Single field: Index on one field
  2. Compound: Index on multiple fields (order matters!)
  3. Multikey: Automatically created when indexing array fields
  4. Text: Full-text search indexes
  5. Geospatial: 2d and 2dsphere for location queries
  6. Hashed: For hash-based sharding
  7. TTL: Automatically delete documents after a time period

JAVASCRIPT
// Single field
db.users.createIndex({ email: 1 })

// Compound (order matters for query coverage)
db.orders.createIndex({ customer_id: 1, created_at: -1 })

// Text index
db.articles.createIndex({ title: "text", body: "text" })

// TTL index (expire after 24 hours)
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 86400 })

Explain compound index order and the ESR rule

For compound indexes, field order determines which queries the index can support efficiently. The ESR rule helps design effective indexes:

  • Equality: Fields with equality conditions (field: value) go first
  • Sort: Fields used in sort go next
  • Range: Fields with range conditions ($gt, $lt, $in) go last

Example: Query is { status: "active", age: { $gt: 21 } } sorted by created_at:

JAVASCRIPT
// Good: Follows ESR
db.users.createIndex({ status: 1, created_at: 1, age: 1 })

// Less optimal: Range before sort
db.users.createIndex({ status: 1, age: 1, created_at: 1 })

What is a covered query?

A covered query is one where all requested fields are in the index. MongoDB returns results directly from the index without reading documents.

JAVASCRIPT
// Index on { name: 1, email: 1 }

// Covered query - only returns indexed fields, excludes _id
db.users.find({ name: "Alice" }, { name: 1, email: 1, _id: 0 })

// Not covered - returns _id which isn't in the index
db.users.find({ name: "Alice" }, { name: 1, email: 1 })

Covered queries are significantly faster for read-heavy workloads.
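A quick way to reason about coverage (plain JavaScript, not a MongoDB API; the authoritative check is explain() showing an IXSCAN with no FETCH stage): every field in the filter and the projection must appear in the index.

```javascript
// Sanity-check helper: is a query coverable by a given index?
function isCovered(indexFields, filterFields, projectedFields) {
  const idx = new Set(indexFields);
  return [...filterFields, ...projectedFields].every(f => idx.has(f));
}

// Index { name: 1, email: 1 } from the example above:
console.log(isCovered(["name", "email"], ["name"], ["name", "email"]));        // true
console.log(isCovered(["name", "email"], ["name"], ["name", "email", "_id"])); // false: _id not in index
```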

Aggregation Questions

Explain the aggregation pipeline

The aggregation pipeline processes documents through stages, where each stage transforms the documents before passing to the next:

JAVASCRIPT
db.orders.aggregate([
  // Stage 1: Filter
  { $match: { status: "completed" } },

  // Stage 2: Group and calculate
  { $group: {
      _id: "$customer_id",
      total: { $sum: "$amount" },
      count: { $sum: 1 }
  }},

  // Stage 3: Sort
  { $sort: { total: -1 } },

  // Stage 4: Limit
  { $limit: 10 }
])

Common stages:

  • $match: Filter documents
  • $group: Group and aggregate
  • $project: Reshape documents
  • $sort: Sort results
  • $limit/$skip: Pagination
  • $lookup: Join with another collection
  • $unwind: Deconstruct arrays

How do you perform a JOIN in MongoDB?

Use the $lookup aggregation stage:

JAVASCRIPT
db.orders.aggregate([
  {
    $lookup: {
      from: "customers",           // Collection to join
      localField: "customer_id",    // Field in orders
      foreignField: "_id",          // Field in customers
      as: "customer"                // Output array field
    }
  },
  { $unwind: "$customer" }         // Convert array to object
])

This is equivalent to a LEFT OUTER JOIN. $lookup is generally less efficient than JOINs in a relational database, so design your schema to minimize the need for it.

Replication and Sharding Questions

What is a replica set?

A replica set is a group of MongoDB servers that maintain the same data:

  • Primary: Receives all writes
  • Secondaries: Replicate from primary, can serve reads
  • Arbiter: Votes in elections but holds no data

If the primary fails, secondaries elect a new primary automatically (usually within 10-12 seconds).
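For reference, here is a minimal three-member replica set configuration as you might pass it to rs.initiate() in mongosh (hostnames are placeholders):

```javascript
// A replica set config document; in mongosh: rs.initiate(config)
const config = {
  _id: "rs0", // replica set name
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]
};

console.log(config.members.length); // 3 voting members
```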

JAVASCRIPT
// Force reads from secondary
db.collection.find().readPref("secondary")

// Read from nearest member
db.collection.find().readPref("nearest")

What is sharding and when do you use it?

Sharding distributes data across multiple servers (shards) for horizontal scaling. Use it when:

  • Data exceeds single server storage capacity
  • Write throughput exceeds single server capability
  • You need to distribute reads across regions

Shard key selection is critical:

  • High cardinality (many unique values)
  • Even distribution (avoid hot spots)
  • Query isolation (queries should target specific shards)

JAVASCRIPT
// Shard a collection
sh.shardCollection("mydb.orders", { customer_id: "hashed" })

Bad shard key example: { created_at: 1 } causes all new data to go to one shard.

Performance Questions

How do you identify slow queries?

  1. Database profiler:

JAVASCRIPT
// Enable profiling for queries over 100ms
db.setProfilingLevel(1, { slowms: 100 })

// View slow queries
db.system.profile.find().sort({ ts: -1 })

  2. Explain plans:

JAVASCRIPT
db.users.find({ email: "test@example.com" }).explain("executionStats")
// Look for:
// - COLLSCAN (bad - full collection scan)
// - IXSCAN (good - using index)
// - totalDocsExamined vs nReturned ratio

  3. MongoDB Atlas Performance Advisor (if using Atlas)

What are read and write concerns?

Write concern determines acknowledgment level for writes:

JAVASCRIPT
// Acknowledge from primary only (the default before MongoDB 5.0; "majority" since 5.0)
{ w: 1 }

// Wait for majority of replica set
{ w: "majority" }

// Wait for write to journal
{ j: true }

Read concern determines data consistency for reads:

JAVASCRIPT
// May return data that could be rolled back
{ readConcern: "local" }

// Only returns data acknowledged by majority
{ readConcern: "majority" }

// Point-in-time, majority-committed snapshot (used with multi-document transactions)
{ readConcern: "snapshot" }
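For concreteness, these concerns are passed per operation. A sketch of the option objects (db.orders and the document are made up; option names match the mongosh/driver API):

```javascript
// Write: don't acknowledge until a majority of members have the write,
// or fail after 5 seconds
const writeOptions = { writeConcern: { w: "majority", wtimeout: 5000 } };
// In mongosh: db.orders.insertOne({ amount: 100 }, writeOptions)

// Read: only return majority-committed data
const readOptions = { readConcern: { level: "majority" } };
// In mongosh: db.orders.aggregate(pipeline, readOptions)

console.log(writeOptions.writeConcern.w); // "majority"
```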

Practical Scenario Questions

Design a schema for a social media feed

JAVASCRIPT
// Users collection
{
  _id: ObjectId,
  username: "alice",
  followers: [ObjectId, ObjectId, ...],  // Only if small
  followers_count: 1500
}

// Posts collection
{
  _id: ObjectId,
  author_id: ObjectId,
  content: "Hello world",
  created_at: ISODate,
  likes_count: 42,
  // Embed recent likes for display
  recent_likes: [
    { user_id: ObjectId, username: "bob" }
  ]
}

// Feed - fan-out on write or read depending on scale
{
  user_id: ObjectId,
  post_id: ObjectId,
  created_at: ISODate
}

Key considerations:

  • Embed data that's always accessed together
  • Use counters instead of array lengths for large counts
  • Fan-out on write for better read performance (at write cost)
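Fan-out on write can be sketched in plain JavaScript (function and field names are illustrative): when a user posts, one feed entry is written per follower, so each follower's feed becomes a cheap indexed read.

```javascript
// Build one feed document per follower for a new post.
function fanOut(post, followerIds) {
  return followerIds.map(followerId => ({
    user_id: followerId, // whose feed this entry lands in
    post_id: post._id,
    created_at: post.created_at
  }));
}

const post = { _id: "p1", author_id: "u1", created_at: new Date("2024-01-13") };
const entries = fanOut(post, ["u2", "u3", "u4"]);
// In mongosh: db.feed.insertMany(entries)
console.log(entries.length); // 3
```

The trade-off: accounts with millions of followers make each post expensive to write, which is why large systems often fan out on read for such accounts instead.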

How would you migrate from SQL to MongoDB?

  1. Analyze access patterns: Document databases optimize for how you read, not how you store
  2. Denormalize strategically: Embed data that's accessed together
  3. Design for queries: Unlike SQL, you design the schema around queries, not entities
  4. Handle relationships: Decide embed vs. reference for each relationship
  5. Plan for scale: Consider sharding strategy early
  6. Migrate incrementally: Run both systems in parallel during transition
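Step 2 (denormalize strategically) can be sketched in plain JavaScript: rows from hypothetical SQL users and addresses tables merged into one embedded document.

```javascript
// Normalized SQL rows (illustrative)
const userRow = { id: 1, name: "Alice" };
const addressRows = [
  { user_id: 1, street: "123 Main St", city: "Seattle" }
];

// Denormalized MongoDB document: addresses embedded in the user
const doc = {
  _id: userRow.id,
  name: userRow.name,
  addresses: addressRows
    .filter(a => a.user_id === userRow.id)      // the old JOIN condition
    .map(({ street, city }) => ({ street, city })) // drop the foreign key
};

console.log(doc.addresses.length); // 1
```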

Tips for the Interview

  1. Understand when NOT to use MongoDB: Interviewers appreciate knowing trade-offs. MongoDB isn't ideal for complex transactions, heavy reporting, or when you need strict relational integrity.

  2. Know the basics deeply: Most failures are on fundamental questions answered poorly, not advanced topics.

  3. Speak to scale: MongoDB is often chosen for scalability. Understand sharding, replica sets, and how they affect your design decisions.

  4. Practice explain plans: Being able to read an explain plan and identify problems shows practical experience.

  5. Mention transactions carefully: Multi-document transactions exist but have overhead. Good candidates know when to design around them vs. when to use them.

Good luck with your interview!
