Docker 📅 2026-02-07

Docker Exit Code 137: How to Fix OOM Killed and Container Termination Errors

🚨 Symptoms & Diagnosis

Encountering Exit Code 137 for a Docker container is a critical signal, frequently indicating that the Linux kernel's Out-Of-Memory (OOM) killer has terminated your process. This often means your container demanded more memory than its allocated cgroup limit, or the host system was under severe memory pressure. Identifying this issue promptly is crucial for maintaining service reliability.
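A quick first check is to list the containers that have already died this way; the `exited` filter is a standard `docker ps` option:

```shell
# List every container that exited with code 137 (128 + 9, i.e. SIGKILL)
docker ps -a --filter "exited=137" --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"
```

Any container listed here is a candidate for the deeper inspection steps below.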

Look for these error signatures in your logs or when inspecting container state:

docker: Error response from daemon: Container exited with code 137

{
  "Status": "exited",
  "OOMKilled": false,
  "ExitCode": 137
}
(Note: OOMKilled: false with ExitCode: 137 is a key indicator for non-direct OOM scenarios, often implying indirect host pressure or orchestrator enforcement.)
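If `jq` is not installed, the same fields can be read with a Go template via `docker inspect --format` (a sketch; `your-app` is a placeholder container name):

```shell
# Print the OOMKilled flag and exit code for a single container
docker inspect --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' your-app
```

`OOMKilled=true` confirms a direct cgroup OOM kill; `OOMKilled=false` with `ExitCode=137` points at host pressure or an external SIGKILL.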

Inspect kernel logs via dmesg for explicit OOM killer events:

dmesg: Out of memory: Killed process 1234 (app) total-vm:XXXXkB, anon-rss:XXXXkB

The OOM killer may also target the container's runtime shim rather than your application process:

Killed process XXXX (containerd-shim) total-vm:XXXXkB, anon-rss:XXXXkB

Root Cause: Exit Code 137 typically signifies a SIGKILL signal (value 9) sent to a process, resulting in an exit code of 128 + 9 = 137. In Docker's context, this is predominantly triggered by the Linux kernel's OOM killer, enforcing cgroup memory limits, or due to severe host memory pressure. Application memory leaks can exacerbate these issues.
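The 128 + signal convention can be demonstrated without Docker at all, since any POSIX shell reports a SIGKILL-ed child the same way:

```shell
# SIGKILL is signal 9, so a killed process exits with 128 + 9 = 137 -
# exactly the code the OOM killer produces for a Docker container.
sh -c 'kill -KILL $$' || code=$?
echo "exit code: $code"   # prints: exit code: 137
```

The same arithmetic applies to other fatal signals, e.g. SIGTERM (15) yields 143.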


🛠️ Solutions

Addressing Docker Exit Code 137 requires both immediate mitigation strategies and robust, long-term engineering solutions to ensure application stability and optimal resource utilization in production environments.

Immediate Mitigation: Increase Container Memory Limit

This is a swift tactical fix to re-establish service availability by providing the container with more ephemeral memory resources. Ideal for immediate production restarts.

If your service is down, the fastest way to bring it back online is to temporarily allocate more memory. This buys you time to investigate the root cause properly.

  1. Identify the failing container's image or existing exited container ID.
  2. Stop any running instances and remove exited instances of the problematic container to avoid resource conflicts.
  3. Re-run the container with an increased --memory limit and, optionally, --memory-swap for robustness.
  4. Continuously monitor resource usage using docker stats to validate the new limits.
# Example: remove exited instances of 'your-image' and stop any still running
docker rm $(docker ps -aq -f status=exited -f ancestor=your-image) 2>/dev/null
docker stop $(docker ps -q -f ancestor=your-image) 2>/dev/null

# Start a new container with increased memory (e.g., 2GB memory, 4GB total swap)
docker run -d --name your-app-instance --memory=2g --memory-swap=4g -p 8080:8080 your-image

# Monitor real-time resource usage of the newly started container
docker stats your-app-instance

Best Practice Fix: Docker Compose Memory Limits + Monitoring

For resilient deployments, integrate resource limits directly into your docker-compose.yml. This ensures consistent resource allocation across deployments and facilitates proper scaling and health management.

Implement declarative resource limits using docker-compose for predictable behavior, coupled with health checks and robust monitoring to prevent recurrence and enable proactive alerting.

  1. Update your docker-compose.yml to specify explicit memory limits and reservations within the deploy.resources section.
  2. Integrate a healthcheck to allow Docker to determine container readiness and a restart policy for automatic recovery.
  3. Deploy your services using the updated docker-compose file.
  4. Ensure you have robust resource monitoring and log aggregation configured for OOM events on your orchestration platform (e.g., Prometheus, Grafana, ELK stack).
version: '3.8'
services:
  app:
    image: your-app:latest
    deploy:
      resources:
        limits:
          memory: 2G # Hard limit: container will be OOM killed if it exceeds this
        reservations:
          memory: 1G # Soft reservation: ensures at least this much memory is available
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"] # Example health endpoint
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped # Always restart unless explicitly stopped
    ports:
      - "8080:8080" # Map host port to container port

# Deploy the updated services
DOCKER_CLIENT_TIMEOUT=120 docker-compose up -d

# Monitor container stats in real-time (useful for debugging during deployment)
watch -n 5 'docker stats --no-stream'

Debug Non-OOM SIGKILL (Exit 137, OOMKilled=false)

Sometimes, a container exits with code 137 but the OOMKilled flag is false. This indicates the SIGKILL signal was sent for reasons other than the direct kernel OOM killer targeting the container's cgroup. These scenarios often point to host-level memory pressure, orchestration system intervention, or other external forces.

  1. Inspect the exited container's state to confirm OOMKilled: false.
  2. Examine the host's kernel logs (dmesg, journalctl) for any OOM events, even those not directly naming your container process.
  3. Investigate if the Docker daemon itself or an orchestrator (like Nomad or Kubernetes) has enforced a kill due to perceived resource violations or health checks.
  4. Perform an in-depth audit of host system memory pressure and swap usage.
# Get the ID of the last exited container (adjust as needed for specific containers)
CONTAINER_ID=$(docker ps -aq -f status=exited | head -n 1)

# Inspect the container's state, specifically looking for OOMKilled flag
docker inspect $CONTAINER_ID | jq '.[].State'

# Check kernel logs for recent OOM kills (system-wide)
dmesg | grep -i 'killed process' | tail -10

# Continuously tail system journal for OOM events, specifically related to the kernel
journalctl -k -f | grep -i oom

# Check overall host memory usage
free -h

# Check kernel memory commit limits (how much memory the kernel is willing to allocate)
grep -i commit /proc/meminfo

🧩 Technical Context (Visualized)

Docker leverages Linux cgroups to manage and isolate container resources, including memory. When a container's memory usage crosses its defined cgroup limit, or the host system itself runs critically low on memory, the Linux kernel's Out-Of-Memory (OOM) killer is invoked. This process sends a SIGKILL (signal 9) directly to the offending process or one of its children, causing the container to terminate abruptly with Exit Code 137.
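The cgroup limit enforced here can be observed from inside a container. A minimal sketch, assuming a cgroup v2 host (on cgroup v1 the corresponding file is /sys/fs/cgroup/memory/memory.limit_in_bytes) and a throwaway container name `memdemo`:

```shell
# Start a container with a 256 MiB hard limit (256 * 1024 * 1024 = 268435456 bytes)
docker run -d --name memdemo --memory=256m alpine sleep 300

# Read the limit the kernel actually enforces (cgroup v2 path)
docker exec memdemo cat /sys/fs/cgroup/memory.max

# The oom_kill counter increments each time the kernel kills a process in this cgroup
docker exec memdemo grep oom_kill /sys/fs/cgroup/memory.events

# Clean up the demo container
docker rm -f memdemo
```

A non-zero `oom_kill` count in `memory.events` is the authoritative record that the kernel, not an external actor, terminated the process.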

graph TD
    A[Docker Container Starts] --> B{Application Memory Usage};
    B --> C{Exceeds cgroup Memory Limit?};
    C -- Yes --> D[Linux Kernel OOM Killer Activates];
    C -- No --> E{Host System Memory Pressure?};
    E -- Yes --> D;
    E -- No --> F[Docker Daemon / Orchestrator Intervention?];
    F -- Yes --> G["Sends SIGKILL Signal (9)"];
    D --> G;
    G --> H[Container Process Receives SIGKILL];
    H --> I["Container Exits with Code 137 (128 + 9)"];

✅ Verification

After implementing a solution, verify its effectiveness by monitoring container status and system logs. These commands help confirm that your containers are stable and OOM events are no longer occurring.

# Check current resource usage for all running containers (no-stream for a snapshot)
docker stats $(docker ps -q) --no-stream

# Review recent kernel messages for OOM killer activity (should be absent or reduced)
dmesg | grep -i 'killed process' | tail -5

# Monitor Docker service logs for OOM or Exit 137 events in real-time
journalctl -u docker.service -f | grep -i 'oom\|137'

# Continuously monitor exited containers; the Status column includes the exit code
watch -n 10 'docker ps -a -f status=exited --format "{{.Names}} {{.Status}}"'

📦 Prerequisites

To effectively diagnose and resolve Docker Exit Code 137 issues, ensure you have:

  * Docker Engine 20.10+
  * docker-compose 2.0+ (if using Docker Compose)
  * journalctl access for system logs
  * root or sudo privileges to inspect kernel logs via dmesg
  * jq for parsing docker inspect JSON output