
Top MongoDB Interview Questions (2026)

JayJay

Whether you're interviewing for a backend role or a dedicated database position, MongoDB questions test whether you understand document databases beyond the basics. Here are the questions that actually come up, with the answers interviewers want to hear.

Fundamental Questions

What is MongoDB and how does it differ from relational databases?

MongoDB is a document database that stores data in JSON-like documents (BSON format). Unlike relational databases that use tables with rows and columns, MongoDB uses collections of documents with flexible schemas.

Key differences:

  • Schema: Relational databases enforce a schema upfront; MongoDB documents can have different structures
  • Relationships: Relational databases use foreign keys and JOINs; MongoDB typically embeds related data or uses references
  • Scaling: Relational databases traditionally scale vertically; MongoDB was designed for horizontal scaling (sharding)
  • Transactions: Relational databases have had ACID transactions for decades; MongoDB added multi-document transactions in version 4.0 (replica sets) and 4.2 (sharded clusters)
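The schema flexibility above can be sketched in plain JavaScript (the collection and field names here are made up for illustration):

```javascript
// Two documents in the same hypothetical "users" collection can have
// completely different shapes -- MongoDB enforces no upfront schema.
const users = [
  { _id: 1, name: "Alice", email: "alice@example.com" },
  { _id: 2, name: "Bob", phone: "555-0100", preferences: { theme: "dark" } }
];

// In mongosh, inserting both shapes into one collection just works:
// db.users.insertMany(users)
const fieldSets = users.map(u => Object.keys(u).sort().join(","));
console.log(fieldSets[0]); // "_id,email,name"
console.log(fieldSets[1]); // "_id,name,phone,preferences"
```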

What is BSON?

BSON (Binary JSON) is the binary representation MongoDB uses to store documents. It extends JSON with additional data types:

  • ObjectId: 12-byte unique identifiers
  • Date: 64-bit integer milliseconds since Unix epoch
  • Binary data: For storing blobs
  • Decimal128: High-precision decimal numbers
  • Regular expressions

BSON is designed for fast encoding/decoding and efficient storage, and it maps cleanly to and from JSON.

What is an ObjectId?

ObjectId is MongoDB's default primary key type, a 12-byte identifier that's roughly chronologically sortable.

Structure (12 bytes):

  • 4 bytes: Unix timestamp
  • 5 bytes: Random value (process-unique)
  • 3 bytes: Incrementing counter

JAVASCRIPT
// Creating an ObjectId
const id = new ObjectId();
// ObjectId("65a27b40d4e5f6a7b8c9d0e1")

// Extracting timestamp
id.getTimestamp();
// ISODate("2024-01-13T12:00:00Z")

ObjectIds are generated client-side by default, enabling insert operations without a round-trip to the server.

Data Modeling Questions

When should you embed documents vs. reference them?

Embed when:

  • Data is accessed together (one-to-few relationship)
  • The embedded document doesn't need to be queried independently
  • The embedded data doesn't grow unboundedly
  • You want atomic updates on the parent

JAVASCRIPT
// Embedded: Address belongs to user, always accessed together
{
  name: "Alice",
  address: {
    street: "123 Main St",
    city: "Seattle"
  }
}

Reference when:

  • Many-to-many relationships
  • The referenced data is large or accessed independently
  • The relationship can grow without limit
  • Data normalization reduces duplication

JAVASCRIPT
// Referenced: Author has many books, books are accessed independently
// User document
{ _id: 1, name: "Alice" }

// Book documents
{ _id: 101, title: "Book 1", author_id: 1 }
{ _id: 102, title: "Book 2", author_id: 1 }

What is the document size limit?

16 MB per document. This is a hard limit in MongoDB.

If you need to store larger data (files, images), use GridFS, which splits files into chunks stored as separate documents.
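To illustrate the idea (not the driver API): GridFS writes file bytes into an fs.chunks collection, 255 KB per chunk by default, plus a metadata document in fs.files. A plain-JS sketch of the chunking step:

```javascript
// Simplified sketch of how GridFS splits a file into chunk documents.
const CHUNK_SIZE = 255 * 1024; // GridFS default chunk size

function toChunks(fileId, buffer) {
  const chunks = [];
  for (let n = 0; n * CHUNK_SIZE < buffer.length; n++) {
    chunks.push({
      files_id: fileId, // reference back to the fs.files metadata document
      n,                // chunk sequence number
      data: buffer.subarray(n * CHUNK_SIZE, (n + 1) * CHUNK_SIZE)
    });
  }
  return chunks;
}

const chunks = toChunks("file-1", Buffer.alloc(600 * 1024)); // ~600 KB file
console.log(chunks.length); // 3 (two full 255 KB chunks plus a 90 KB remainder)
```

In practice you would use the driver's GridFS API (e.g. GridFSBucket in the Node driver) rather than chunking by hand.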

How do you handle many-to-many relationships?

Option 1: Array of references (simpler, good for small arrays)

JAVASCRIPT
// Student has array of course IDs
{ _id: 1, name: "Alice", courses: [101, 102, 103] }

// Course has array of student IDs
{ _id: 101, name: "Math", students: [1, 2, 3] }

Option 2: Junction collection (better for large relationships)

JAVASCRIPT
// Enrollment collection
{ student_id: 1, course_id: 101, enrolled_at: ISODate(...) }
{ student_id: 1, course_id: 102, enrolled_at: ISODate(...) }

The junction collection approach is preferred when:

  • The relationship has its own attributes (like enrollment date)
  • The arrays would grow very large
  • You need to query the relationship itself
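Querying the junction collection is an aggregation job. A sketch of fetching one student's courses (collection and field names follow the example above); in mongosh this pipeline would be passed to db.enrollments.aggregate(...):

```javascript
const studentId = 1; // hypothetical student

const pipeline = [
  // Stage 1: only this student's enrollments
  { $match: { student_id: studentId } },
  // Stage 2: pull in the matching course document
  { $lookup: {
      from: "courses",
      localField: "course_id",
      foreignField: "_id",
      as: "course"
  }},
  // Stage 3: $lookup outputs an array; flatten it to one course per row
  { $unwind: "$course" }
];

console.log(pipeline.length); // 3 stages
```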

Indexing Questions

What types of indexes does MongoDB support?

  1. Single field: Index on one field
  2. Compound: Index on multiple fields (order matters!)
  3. Multikey: Automatically created when indexing array fields
  4. Text: Full-text search indexes
  5. Geospatial: 2d and 2dsphere for location queries
  6. Hashed: For hash-based sharding
  7. TTL: Automatically delete documents after a time period

JAVASCRIPT
// Single field
db.users.createIndex({ email: 1 })

// Compound (order matters for query coverage)
db.orders.createIndex({ customer_id: 1, created_at: -1 })

// Text index
db.articles.createIndex({ title: "text", body: "text" })

// TTL index (expire after 24 hours)
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 86400 })

Explain compound index order and the ESR rule

For compound indexes, field order determines which queries the index can support efficiently. The ESR rule helps design effective indexes:

  • Equality: Fields with equality conditions (field: value) go first
  • Sort: Fields used in sort go next
  • Range: Fields with range conditions ($gt, $lt, $in) go last

Example: Query is { status: "active", age: { $gt: 21 } } sorted by created_at:

JAVASCRIPT
// Good: Follows ESR
db.users.createIndex({ status: 1, created_at: 1, age: 1 })

// Less optimal: Range before sort
db.users.createIndex({ status: 1, age: 1, created_at: 1 })

What is a covered query?

A covered query is one where all requested fields are in the index. MongoDB returns results directly from the index without reading documents.

JAVASCRIPT
// Index on { name: 1, email: 1 }

// Covered query - only returns indexed fields, excludes _id
db.users.find({ name: "Alice" }, { name: 1, email: 1, _id: 0 })

// Not covered - returns _id which isn't in the index
db.users.find({ name: "Alice" }, { name: 1, email: 1 })

Covered queries are significantly faster for read-heavy workloads.
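A quick way to reason about coverage (plain JavaScript, not a MongoDB API; the authoritative check is explain() showing an IXSCAN with no FETCH stage): every field in the filter and the projection must appear in the index.

```javascript
// Sanity-check helper: is a query coverable by a given index?
function isCovered(indexFields, filterFields, projectedFields) {
  const idx = new Set(indexFields);
  return [...filterFields, ...projectedFields].every(f => idx.has(f));
}

// Index { name: 1, email: 1 } from the example above:
console.log(isCovered(["name", "email"], ["name"], ["name", "email"]));        // true
console.log(isCovered(["name", "email"], ["name"], ["name", "email", "_id"])); // false: _id not in index
```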

Aggregation Questions

Explain the aggregation pipeline

The aggregation pipeline processes documents through stages, where each stage transforms the documents before passing to the next:

JAVASCRIPT
db.orders.aggregate([
  // Stage 1: Filter
  { $match: { status: "completed" } },

  // Stage 2: Group and calculate
  { $group: {
      _id: "$customer_id",
      total: { $sum: "$amount" },
      count: { $sum: 1 }
  }},

  // Stage 3: Sort
  { $sort: { total: -1 } },

  // Stage 4: Limit
  { $limit: 10 }
])

Common stages:

  • $match: Filter documents
  • $group: Group and aggregate
  • $project: Reshape documents
  • $sort: Sort results
  • $limit/$skip: Pagination
  • $lookup: Join with another collection
  • $unwind: Deconstruct arrays

How do you perform a JOIN in MongoDB?

Use the $lookup aggregation stage:

JAVASCRIPT
db.orders.aggregate([
  {
    $lookup: {
      from: "customers",           // Collection to join
      localField: "customer_id",    // Field in orders
      foreignField: "_id",          // Field in customers
      as: "customer"                // Output array field
    }
  },
  { $unwind: "$customer" }         // Convert array to object
])

This is equivalent to a LEFT OUTER JOIN. $lookup is generally less efficient than JOINs in a relational database, so design your schema to minimize the need for it.

Replication and Sharding Questions

What is a replica set?

A replica set is a group of MongoDB servers that maintain the same data:

  • Primary: Receives all writes
  • Secondaries: Replicate from primary, can serve reads
  • Arbiter: Votes in elections but holds no data

If the primary fails, secondaries elect a new primary automatically (usually within 10-12 seconds).
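For reference, here is a minimal three-member replica set configuration as you might pass it to rs.initiate() in mongosh (hostnames are placeholders):

```javascript
// A replica set config document; in mongosh: rs.initiate(config)
const config = {
  _id: "rs0", // replica set name
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]
};

console.log(config.members.length); // 3 voting members
```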

JAVASCRIPT
// Force reads from secondary
db.collection.find().readPref("secondary")

// Read from nearest member
db.collection.find().readPref("nearest")

What is sharding and when do you use it?

Sharding distributes data across multiple servers (shards) for horizontal scaling. Use it when:

  • Data exceeds single server storage capacity
  • Write throughput exceeds single server capability
  • You need to distribute reads across regions

Shard key selection is critical:

  • High cardinality (many unique values)
  • Even distribution (avoid hot spots)
  • Query isolation (queries should target specific shards)

JAVASCRIPT
// Shard a collection
sh.shardCollection("mydb.orders", { customer_id: "hashed" })

Bad shard key example: { created_at: 1 } causes all new data to go to one shard.

Performance Questions

How do you identify slow queries?

  1. Database profiler:

JAVASCRIPT
// Enable profiling for queries over 100ms
db.setProfilingLevel(1, { slowms: 100 })

// View slow queries
db.system.profile.find().sort({ ts: -1 })

  2. Explain plans:

JAVASCRIPT
db.users.find({ email: "test@example.com" }).explain("executionStats")
// Look for:
// - COLLSCAN (bad - full collection scan)
// - IXSCAN (good - using index)
// - totalDocsExamined vs nReturned ratio

  3. MongoDB Atlas Performance Advisor (if using Atlas)

What are read and write concerns?

Write concern determines acknowledgment level for writes:

JAVASCRIPT
// Acknowledge from primary only (the default before MongoDB 5.0; "majority" since 5.0)
{ w: 1 }

// Wait for majority of replica set
{ w: "majority" }

// Wait for write to journal
{ j: true }

Read concern determines data consistency for reads:

JAVASCRIPT
// May return data that could be rolled back
{ readConcern: "local" }

// Only returns data acknowledged by majority
{ readConcern: "majority" }

// Point-in-time, majority-committed snapshot (used with multi-document transactions)
{ readConcern: "snapshot" }
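For concreteness, these concerns are passed per operation. A sketch of the option objects (db.orders and the document are made up; option names match the mongosh/driver API):

```javascript
// Write: don't acknowledge until a majority of members have the write,
// or fail after 5 seconds
const writeOptions = { writeConcern: { w: "majority", wtimeout: 5000 } };
// In mongosh: db.orders.insertOne({ amount: 100 }, writeOptions)

// Read: only return majority-committed data
const readOptions = { readConcern: { level: "majority" } };
// In mongosh: db.orders.aggregate(pipeline, readOptions)

console.log(writeOptions.writeConcern.w); // "majority"
```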

Practical Scenario Questions

Design a schema for a social media feed

JAVASCRIPT
// Users collection
{
  _id: ObjectId,
  username: "alice",
  followers: [ObjectId, ObjectId, ...],  // Only if small
  followers_count: 1500
}

// Posts collection
{
  _id: ObjectId,
  author_id: ObjectId,
  content: "Hello world",
  created_at: ISODate,
  likes_count: 42,
  // Embed recent likes for display
  recent_likes: [
    { user_id: ObjectId, username: "bob" }
  ]
}

// Feed - fan-out on write or read depending on scale
{
  user_id: ObjectId,
  post_id: ObjectId,
  created_at: ISODate
}

Key considerations:

  • Embed data that's always accessed together
  • Use counters instead of array lengths for large counts
  • Fan-out on write for better read performance (at write cost)
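Fan-out on write can be sketched in plain JavaScript (function and field names are illustrative): when a user posts, one feed entry is written per follower, so each follower's feed becomes a cheap indexed read.

```javascript
// Build one feed document per follower for a new post.
function fanOut(post, followerIds) {
  return followerIds.map(followerId => ({
    user_id: followerId, // whose feed this entry lands in
    post_id: post._id,
    created_at: post.created_at
  }));
}

const post = { _id: "p1", author_id: "u1", created_at: new Date("2024-01-13") };
const entries = fanOut(post, ["u2", "u3", "u4"]);
// In mongosh: db.feed.insertMany(entries)
console.log(entries.length); // 3
```

The trade-off: accounts with millions of followers make each post expensive to write, which is why large systems often fan out on read for such accounts instead.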

How would you migrate from SQL to MongoDB?

  1. Analyze access patterns: Document databases optimize for how you read, not how you store
  2. Denormalize strategically: Embed data that's accessed together
  3. Design for queries: Unlike SQL, you design the schema around queries, not entities
  4. Handle relationships: Decide embed vs. reference for each relationship
  5. Plan for scale: Consider sharding strategy early
  6. Migrate incrementally: Run both systems in parallel during transition
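Step 2 (denormalize strategically) can be sketched in plain JavaScript: rows from hypothetical SQL users and addresses tables merged into one embedded document.

```javascript
// Normalized SQL rows (illustrative)
const userRow = { id: 1, name: "Alice" };
const addressRows = [
  { user_id: 1, street: "123 Main St", city: "Seattle" }
];

// Denormalized MongoDB document: addresses embedded in the user
const doc = {
  _id: userRow.id,
  name: userRow.name,
  addresses: addressRows
    .filter(a => a.user_id === userRow.id)      // the old JOIN condition
    .map(({ street, city }) => ({ street, city })) // drop the foreign key
};

console.log(doc.addresses.length); // 1
```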

Tips for the Interview

  1. Understand when NOT to use MongoDB: Interviewers appreciate knowing trade-offs. MongoDB isn't ideal for complex transactions, heavy reporting, or when you need strict relational integrity.

  2. Know the basics deeply: Most failures are on fundamental questions answered poorly, not advanced topics.

  3. Speak to scale: MongoDB is often chosen for scalability. Understand sharding, replica sets, and how they affect your design decisions.

  4. Practice explain plans: Being able to read an explain plan and identify problems shows practical experience.

  5. Mention transactions carefully: Multi-document transactions exist but have overhead. Good candidates know when to design around them vs. when to use them.

Good luck with your interview!
