MongoDB 101

MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database, MongoDB uses JSON-like documents with optional schemas. This makes it a flexible and scalable choice for modern applications.

Overview

Instead of the tables and rows found in relational databases, MongoDB stores data in BSON (Binary JSON) documents. These documents are grouped into collections. The document model maps to the objects in your application code, making it easy to work with data. MongoDB is well-suited for big data applications, content management systems, and mobile applications.
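
To make the document model concrete, here is a minimal sketch in JavaScript, the language mongosh itself uses. The order object and all of its field names are invented for illustration. The point is only that nested objects and arrays live inside one document rather than being split across joined tables.

```javascript
// Hypothetical application object: every field name here is illustrative.
const order = {
  customer: { name: "John Doe", city: "New York" }, // embedded sub-document
  items: [                                          // array of sub-documents
    { sku: "A-100", qty: 2, price: 9.5 },
    { sku: "B-200", qty: 1, price: 20 },
  ],
  placedAt: new Date("2024-01-15T10:00:00Z"),       // would be stored as a BSON date
};

// The object is already document-shaped, so application code can work with
// it directly, for example to compute a total without any joins.
const total = order.items.reduce((sum, item) => sum + item.qty * item.price, 0);

console.log(total); // prints 39
```

In mongosh, db.orders.insertOne(order) would store this object as a single document; orders is a hypothetical collection name.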

Installation

The installation process for MongoDB varies by operating system. The steps below install MongoDB 6.0 on Ubuntu 20.04 (Focal); on other Ubuntu or Debian releases, the repository line in step 2 differs.

Ubuntu/Debian

  1. Import the public key used by the package management system:

    sudo apt-get install -y gnupg curl
    curl -fsSL https://pgp.mongodb.com/server-6.0.asc | \
    sudo gpg -o /usr/share/keyrings/mongodb-server-6.0.gpg \
    --dearmor
  2. Create a list file for MongoDB:

    echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-6.0.gpg ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list
  3. Reload local package database:

    sudo apt-get update
  4. Install the MongoDB packages:

    sudo apt-get install -y mongodb-org
  5. Start and enable the MongoDB service:

    sudo systemctl start mongod
    sudo systemctl enable mongod

Basic Concepts and Commands

Connecting to MongoDB

To connect to the MongoDB server from the command line, use the mongosh shell. With no arguments, it connects to a server on localhost at the default port 27017:

mongosh

Common Commands

  • Show Databases:

    show dbs
  • Create or Switch to a Database (a new database is only created once data is stored in it):

    use myNewDatabase
  • Create a Collection and Insert a Document:

    db.myCollection.insertOne({ name: "John Doe", age: 30, city: "New York" })
  • Find All Documents in a Collection:

    db.myCollection.find()
  • Find Documents Matching a Filter:

    db.myCollection.find({ name: "John Doe" })
  • Update a Document:

    db.myCollection.updateOne({ name: "John Doe" }, { $set: { age: 31 } })
  • Delete a Document:

    db.myCollection.deleteOne({ name: "John Doe" })
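
The commands above can be strung together into one create, read, update, delete pass. The following is a minimal sketch rather than a recommended pattern: crudWalkthrough is a hypothetical helper name, and it assumes myCollection holds no other document named "John Doe", since findOne returns only the first match.

```javascript
// Sketch: the commands above as one function. In mongosh, pass the global `db`.
// `crudWalkthrough` is a hypothetical helper name, not a MongoDB API.
function crudWalkthrough(db) {
  const people = db.getCollection("myCollection");

  // Create: the collection is created implicitly on the first insert.
  const inserted = people.insertOne({ name: "John Doe", age: 30, city: "New York" });

  // Read: findOne returns the first matching document, or null.
  const before = people.findOne({ name: "John Doe" });

  // Update: $set changes only the listed fields.
  const updated = people.updateOne({ name: "John Doe" }, { $set: { age: 31 } });
  const after = people.findOne({ name: "John Doe" });

  // Delete: removes at most one matching document.
  const deleted = people.deleteOne({ name: "John Doe" });

  return {
    insertedId: inserted.insertedId,
    ageBefore: before.age,
    ageAfter: after.age,
    modifiedCount: updated.modifiedCount,
    deletedCount: deleted.deletedCount,
  };
}
```

To try it, paste the function into mongosh and call crudWalkthrough(db). The function takes db as a parameter so that it does not depend on which database is currently selected.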

Docker Installation

Running MongoDB in a Docker container lets you try it without installing packages on the host.

  1. Pull the MongoDB Image:

    docker pull mongo:latest
  2. Run the MongoDB Container:

    docker run --name some-mongo -d mongo:latest

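    This command starts a new container named some-mongo in detached mode. It does not publish port 27017 to the host, so connect through docker exec as in the next step. Data does not carry over to a new container unless you mount a named volume or host directory at /data/db.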

  3. Connect to the Container:

    docker exec -it some-mongo mongosh

Kubernetes Deployment

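Deploying MongoDB on Kubernetes can be done using a StatefulSet, which gives each pod a stable identity and its own persistent volume. Note that the manifests below start three independent mongod instances; they do not form a replica set unless mongod is started with a replica set name and the set is initiated. For production environments, using the MongoDB Enterprise Kubernetes Operator is recommended.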

  1. Create a Headless Service:

    apiVersion: v1
    kind: Service
    metadata:
      name: mongo
    spec:
      selector:
        app: mongo
      ports:
        - port: 27017
          targetPort: 27017
      clusterIP: None
  2. Create a StatefulSet:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: mongo
    spec:
      serviceName: "mongo"
      replicas: 3
      selector:
        matchLabels:
          app: mongo
      template:
        metadata:
          labels:
            app: mongo
        spec:
          containers:
            - name: mongo
              image: mongo:latest
              ports:
                - containerPort: 27017
              volumeMounts:
                - name: mongo-persistent-storage
                  mountPath: /data/db
      volumeClaimTemplates:
        - metadata:
            name: mongo-persistent-storage
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
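
The stable identities provided by the StatefulSet and headless Service mean each pod gets a predictable DNS name of the form <pod>.<service> within its namespace, for example mongo-0.mongo. As a sketch of why that matters, the hypothetical helper below builds the configuration document that rs.initiate() expects for a three-member replica set. It assumes each mongod was started with --replSet rs0, which the manifest above does not do; the replica set name rs0 is an illustrative choice.

```javascript
// Hypothetical helper: builds a replica set config from StatefulSet naming rules.
// Pods of a StatefulSet are named <statefulset>-<ordinal>, and a headless
// Service gives each one the DNS name <pod>.<service> inside the namespace.
function buildReplicaSetConfig(replSetName, statefulSetName, serviceName, replicas, port = 27017) {
  const members = [];
  for (let ordinal = 0; ordinal < replicas; ordinal++) {
    members.push({
      _id: ordinal,
      host: `${statefulSetName}-${ordinal}.${serviceName}:${port}`,
    });
  }
  return { _id: replSetName, members };
}

// Assumed values: replica set "rs0", StatefulSet "mongo", Service "mongo", 3 replicas.
const config = buildReplicaSetConfig("rs0", "mongo", "mongo", 3);
console.log(JSON.stringify(config, null, 2));
```

In a mongosh session on mongo-0, rs.initiate(config) would apply this document, provided the --replSet assumption holds.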

Configuration and Tuning

  • Configuration File: The main configuration file for MongoDB is mongod.conf. Its location varies by platform; on Linux package installations it is typically /etc/mongod.conf.
  • Key Configuration Options:
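    • storage.wiredTiger.engineConfig.cacheSizeGB: This setting caps the memory that WiredTiger (the default storage engine) uses for its internal cache. If unset, it defaults to the larger of 50% of (RAM minus 1 GB) and 256 MB. MongoDB's documentation advises against raising it above the default, because the filesystem cache and other processes also need memory; lower it when several mongod instances share one machine.
    • net.maxIncomingConnections: This setting caps the number of simultaneous connections that mongod accepts. It has no effect if it is higher than the operating system's own connection limits.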
  • Tuning:
    • Indexing: Create indexes on fields that are frequently queried to improve query performance.
    • Sharding: For very large datasets, you can use sharding to distribute the data across multiple servers.
    • Monitoring: Use MongoDB's built-in monitoring tools, such as mongostat and mongotop, or external monitoring solutions like MongoDB Atlas Monitoring or Prometheus with the mongodb_exporter.
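
Returning to the cacheSizeGB option listed above: the documented default for the WiredTiger cache is the larger of 50% of (RAM minus 1 GB) and 256 MB. The function below restates that rule as arithmetic so you can see what a given machine would get before overriding it. defaultWiredTigerCacheGB is a hypothetical name, not part of any MongoDB tool.

```javascript
// Default WiredTiger cache size in GB for a given amount of RAM in GB:
// the larger of 50% of (RAM - 1 GB) and 0.25 GB (256 MB).
// Hypothetical helper for illustration only.
function defaultWiredTigerCacheGB(totalRamGB) {
  return Math.max(0.5 * (totalRamGB - 1), 0.25);
}

console.log(defaultWiredTigerCacheGB(16)); // prints 7.5
console.log(defaultWiredTigerCacheGB(4));  // prints 1.5
console.log(defaultWiredTigerCacheGB(1));  // prints 0.25
```

On a 16 GB host, for example, the default cache is 7.5 GB, which leaves the rest for the filesystem cache, connections, and other processes.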