Building Containers from Scratch (Part 3): Resource Limits with cgroups

Introduction

In Part 2, we added network isolation to our container using namespaces.

However, our container can still consume unlimited resources:

  • All available CPU
  • All available memory
  • All available I/O bandwidth

A misbehaving container could crash the entire host.

In this part, we introduce Control Groups (cgroups) — Linux's resource management system.

Understanding cgroups

cgroups (Control Groups) are a Linux kernel feature for:

  • Resource limiting — cap CPU, memory, I/O usage
  • Prioritization — some groups get more resources than others
  • Accounting — track resource consumption
  • Control — freeze/resume process groups

cgroups are organized hierarchically in a virtual filesystem at:

/sys/fs/cgroup

cgroup v2 Architecture

Modern Linux systems use cgroup v2 (unified hierarchy).
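Whether a given host runs the unified hierarchy can be checked from the filesystem type mounted at /sys/fs/cgroup. The sketch below is illustrative only: the helper name cgroup_version is hypothetical, and the mapping assumes the common convention that a v2 mount reports cgroup2fs while a v1 or hybrid setup reports tmpfs.

```shell
# Hypothetical helper: map a filesystem type name to a cgroup version label.
cgroup_version() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1 or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# On a real host (stat -f reports the filesystem type of the mount):
#   cgroup_version "$(stat -fc %T /sys/fs/cgroup)"
```

The rest of this part assumes the answer is v2; the file names used later (memory.max, cpu.max, pids.max) do not exist under v1.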

┌────────────────────────────────────────────────────────────┐
│              /sys/fs/cgroup (cgroup v2)                    │
│                                                            │
│  ┌──────────────────────────────────────────────────┐     │
│  │         my_container_limits/                     │     │
│  │                                                  │     │
│  │  ├─ memory.max          (limit: 900M)           │     │
│  │  ├─ memory.current      (usage: 245M)           │     │
│  │  ├─ cpu.max             (limit: 50%)            │     │
│  │  └─ cgroup.procs        (PIDs: 1147, 1148)      │     │
│  │                                                  │     │
│  └──────────────────┬───────────────────────────────┘     │
│                     │                                     │
│                     │  Kernel enforces limits             │
│                     ▼                                     │
│         ┌─────────────────────────┐                       │
│         │   Container Processes   │                       │
│         │   PID: 1147, 1148       │                       │
│         │                         │                       │
│         │   Memory: 245M / 900M   │                       │
│         │   CPU: 30% / 50%        │                       │
│         └─────────────────────────┘                       │
│                                                            │
└────────────────────────────────────────────────────────────┘

When the limit is reached:

  • CPU → process is throttled
  • Memory → the kernel tries to reclaim pages first; if that is not enough, a process in the cgroup is killed (OOM)

Key cgroup Controllers

  • memory — limit RAM usage, trigger OOM killer
  • cpu — limit CPU time, set shares/quotas
  • io — limit disk I/O bandwidth
  • pids — limit number of processes

Limiting Container Memory

We will create a container that cannot use more than 900MB of RAM.

Step 1: Create a cgroup

sudo mkdir /sys/fs/cgroup/my_container_limits

Verify:

ls /sys/fs/cgroup/my_container_limits/

You should see controller files like:

memory.max
memory.current
cpu.max
pids.max
cgroup.procs
...
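If memory.max, cpu.max, or pids.max is missing from the listing, the corresponding controller is probably not enabled for child cgroups. In cgroup v2, the parent's cgroup.subtree_control file decides which controllers its children get. A minimal sketch of a check follows; the helper name missing_controllers is hypothetical.

```shell
# Hypothetical helper: print "+name" for each wanted controller that is NOT
# listed in the given cgroup.subtree_control file.
missing_controllers() {
    file="$1"; shift
    enabled=" $(cat "$file") "
    out=""
    for c in "$@"; do
        case "$enabled" in
            *" $c "*) ;;                 # already enabled for children
            *) out="$out +$c" ;;
        esac
    done
    echo "${out# }"
}

# On a real host:
#   missing_controllers /sys/fs/cgroup/cgroup.subtree_control memory cpu pids
# If it prints anything, enable those controllers for children, for example:
#   echo "+memory +cpu +pids" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
```

On many systemd-based distributions these controllers are already enabled at the root, in which case the helper prints an empty line.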

Step 2: Set a Memory Limit

echo "900M" | sudo tee /sys/fs/cgroup/my_container_limits/memory.max

Verify:

cat /sys/fs/cgroup/my_container_limits/memory.max

Output:

943718400

The kernel treats the M suffix as a power-of-two multiplier (MiB), so 900M = 900 × 1024 × 1024 = 943718400 bytes.
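The same conversion can be reproduced with shell arithmetic, which is handy for predicting what memory.max will read back. This is a sketch under the assumption that only the K, M, and G suffixes are needed; to_bytes is a hypothetical helper name.

```shell
# Hypothetical helper: convert a size with an optional K/M/G suffix to bytes,
# using power-of-two multipliers.
to_bytes() {
    n="${1%[KMGkmg]}"                          # numeric part without suffix
    case "$1" in
        *[Kk]) echo $((n * 1024)) ;;
        *[Mm]) echo $((n * 1024 * 1024)) ;;
        *[Gg]) echo $((n * 1024 * 1024 * 1024)) ;;
        *)     echo "$n" ;;
    esac
}

to_bytes 900M    # prints 943718400
```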

Step 3: Start Container (Network + Filesystem Isolation)

# Create network namespace
sudo ip netns add container_net

# Create veth pair
sudo ip link add veth-host type veth peer name veth-container
sudo ip link set veth-container netns container_net

# Configure host side
sudo ip addr add 192.168.10.1/24 dev veth-host
sudo ip link set veth-host up

# Configure container side
sudo ip netns exec container_net ip link set lo up
sudo ip netns exec container_net ip addr add 192.168.10.2/24 dev veth-container
sudo ip netns exec container_net ip link set veth-container up

# Enter container
sudo ip netns exec container_net chroot my_container /bin/bash

Step 4: Get Container PID

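In the container shell, print the shell's own PID:

echo $$

Example:

12847

PID = 12847

The commands in Step 3 do not create a PID namespace, so this is the same PID the host sees. Searching from the host with ps aux | grep "chroot my_container" is unreliable here: it matches the parent sudo process rather than the shell, and moving a parent into a cgroup does not move children it has already forked.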

Step 5: Add Process to cgroup

CPID=12847
echo ${CPID} | sudo tee /sys/fs/cgroup/my_container_limits/cgroup.procs

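The container shell is now memory-limited, and so is every process it starts from this point on, because children inherit their parent's cgroup.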
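It is worth confirming that the move actually took effect. Each process exposes its cgroup membership in /proc/<pid>/cgroup, and on cgroup v2 the entry has the form 0::<path>. A small sketch follows; cgroup_v2_path is a hypothetical helper name.

```shell
# Hypothetical helper: extract the cgroup v2 path from a /proc/<pid>/cgroup
# file. The v2 entry is the line whose hierarchy ID (first field) is 0.
cgroup_v2_path() {
    awk -F: '$1 == "0" { print $3 }' "$1"
}

# On the host, with CPID set as in Step 5:
#   cgroup_v2_path "/proc/${CPID}/cgroup"
# This should print /my_container_limits once the move has succeeded.
```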

Step 6: Monitor Memory Usage

watch -n 1 cat /sys/fs/cgroup/my_container_limits/memory.current
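memory.current reports raw bytes, which are hard to compare against the limit at a glance. The sketch below prints usage next to the limit in MiB; mem_usage is a hypothetical helper, and it assumes the cgroup directory contains both memory.current and memory.max.

```shell
# Hypothetical helper: print "<used> MiB / <limit> MiB" for a cgroup directory.
mem_usage() {
    used=$(cat "$1/memory.current")
    limit=$(cat "$1/memory.max")
    if [ "$limit" = "max" ]; then              # "max" means no limit is set
        echo "$((used / 1048576)) MiB / unlimited"
    else
        echo "$((used / 1048576)) MiB / $((limit / 1048576)) MiB"
    fi
}

# On a real host, refreshed once per second:
#   while sleep 1; do mem_usage /sys/fs/cgroup/my_container_limits; done
```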

Step 7: Test Memory Limit

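Inside the container:

for i in {1..10000000}; do echo "$i"; done

Bash expands {1..10000000} into the full list of words before the loop starts, and that list is what consumes the memory.

If the cgroup's usage reaches the 900 MiB limit and the kernel cannot reclaim enough memory, it invokes the OOM killer and terminates a process inside the cgroup.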
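To tell whether the OOM killer actually fired, rather than the shell exiting for some other reason, the cgroup keeps counters in memory.events. A minimal sketch follows; oom_kill_count is a hypothetical helper that reads the oom_kill counter from that file.

```shell
# Hypothetical helper: print how many processes in the cgroup have been
# killed by the OOM killer, as recorded in memory.events.
oom_kill_count() {
    awk '$1 == "oom_kill" { print $2 }' "$1/memory.events"
}

# On the host, after running the test:
#   oom_kill_count /sys/fs/cgroup/my_container_limits
```

A value that increases after the test indicates the limit was enforced by killing a process. If it stays at 0, the loop may simply not have needed that much memory, or swap may have absorbed the pressure.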

Limiting CPU Usage

Set CPU Limit (50% of one core)

# Format: $QUOTA $PERIOD
# 50000 out of 100000 microseconds = 50%
echo "50000 100000" | sudo tee /sys/fs/cgroup/my_container_limits/cpu.max

Now the container can use only 50% of one CPU core.
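Other limits follow from the same quota/period arithmetic. The sketch below builds a cpu.max value from a percentage of one core; cpu_max is a hypothetical helper, and the 100000 microsecond default period is taken from the example above.

```shell
# Hypothetical helper: build a cpu.max value from a percentage of one core.
# 100 means one full core, 200 means two cores, and so on.
cpu_max() {
    period="${2:-100000}"                      # period in microseconds
    echo "$(( $1 * period / 100 )) $period"
}

cpu_max 50     # prints: 50000 100000
cpu_max 150    # prints: 150000 100000
```

A quota larger than the period is valid and lets the cgroup use more than one core in total.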

Test CPU Limit

Inside container:

while true; do :; done

On host:

top -p ${CPID}

CPU usage will cap at ~50%.
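top shows the effect of the limit; the cgroup's cpu.stat file shows the cause. With the cpu controller enabled it includes nr_periods (enforcement periods elapsed) and nr_throttled (periods in which the cgroup ran out of quota). A sketch of a summary follows; throttle_summary is a hypothetical helper.

```shell
# Hypothetical helper: summarize how often the cgroup was throttled,
# based on the nr_periods and nr_throttled counters in cpu.stat.
throttle_summary() {
    awk '$1 == "nr_periods"   { p = $2 }
         $1 == "nr_throttled" { t = $2 }
         END { printf "throttled in %d of %d periods\n", t, p }' "$1/cpu.stat"
}

# On the host, while the busy loop is running:
#   throttle_summary /sys/fs/cgroup/my_container_limits
```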

Limiting Number of Processes

echo "100" | sudo tee /sys/fs/cgroup/my_container_limits/pids.max

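The container cannot hold more than 100 tasks (processes and threads) at once. When the limit is reached, further fork() and clone() calls inside the cgroup fail.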
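The current task count is exposed in pids.current, next to pids.max. The sketch below reports how much headroom is left; pids_remaining is a hypothetical helper and assumes both files are present in the cgroup directory.

```shell
# Hypothetical helper: print how many more tasks the cgroup may create.
pids_remaining() {
    max=$(cat "$1/pids.max")
    cur=$(cat "$1/pids.current")
    if [ "$max" = "max" ]; then                # "max" means no limit is set
        echo "unlimited"
    else
        echo $((max - cur))
    fi
}

# On the host:
#   pids_remaining /sys/fs/cgroup/my_container_limits
```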

What We Achieved

The container now has:

  • Filesystem isolation (chroot)
  • Network isolation (network namespaces)
  • Resource limits (cgroups)

This is the foundation behind Docker flags:

  • --memory
  • --cpus
  • --pids-limit
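To make the correspondence concrete, the three writes from this part can be bundled into one function. This is a rough sketch, not how Docker is implemented: apply_limits is a hypothetical helper, it must run as root against a real cgroup, and the CPU argument is a percentage of one core.

```shell
# Hypothetical helper: write the three limits from this part into a
# cgroup directory. Arguments: dir, memory limit, CPU percent, max tasks.
apply_limits() {
    dir="$1"
    echo "$2" > "$dir/memory.max"                              # e.g. 900M
    echo "$(( $3 * 100000 / 100 )) 100000" > "$dir/cpu.max"    # quota period
    echo "$4" > "$dir/pids.max"                                # e.g. 100
}

# As root:
#   apply_limits /sys/fs/cgroup/my_container_limits 900M 50 100
# Roughly comparable to:
#   docker run --memory 900m --cpus 0.5 --pids-limit 100 ...
```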

Cleanup

Exit container:

exit

Remove cgroup:

sudo rmdir /sys/fs/cgroup/my_container_limits
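rmdir only succeeds on a cgroup with no processes left in it; otherwise it fails with a "Device or resource busy" error. A small check before removal can save some confusion. The helper name cgroup_is_empty is hypothetical. It reads the file contents rather than testing the file size, because cgroup pseudo-files report a size of 0 regardless of content.

```shell
# Hypothetical helper: succeed only if cgroup.procs lists no PIDs.
cgroup_is_empty() {
    [ -z "$(cat "$1/cgroup.procs")" ]
}

# On the host:
#   cgroup_is_empty /sys/fs/cgroup/my_container_limits \
#       && sudo rmdir /sys/fs/cgroup/my_container_limits
```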

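Delete the veth pair and the network namespace:

sudo ip link del veth-host
sudo ip netns del container_net

Deleting either end of a veth pair removes both ends, so veth-host goes first. Once the namespace is gone, the pair is destroyed with it and ip link del would report a missing device.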

Key Takeaways

  • cgroups enforce resource limits
  • Memory limits trigger OOM killer
  • CPU limits throttle (not kill)
  • Process limits prevent fork bombs
  • This is a core primitive behind real containers

Next in the Series

Part 4: Layered Filesystems with Overlay →