Docker Workshop: Conceptual Q&A

This document captures fundamental questions and answers about Docker concepts, organized by topic.

Docker Architecture & Components

Q: What is the distinction between a socket and a port?

UNIX Socket (like /var/run/docker.sock):

File-based communication on the same machine
Lives in the filesystem (you can ls it!)
Fast, local-only (no network involved)
Used for inter-process communication (IPC)
Example: Docker CLI talks to Docker daemon via socket

Network Port (like port 80, 8080):

Number-based communication over network
Can be local (localhost:8080) or remote (server.com:80)
Uses TCP or UDP on top of IP
Used for network communication
Example: Web browser talks to nginx via port 80

Analogy:

Socket = internal phone line between offices
Port = external phone number for customers
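The difference shows up directly in how a program addresses each one. The sketch below is a minimal illustration, not how Docker itself is implemented: it sends the same bytes once over a UNIX socket (addressed by a filesystem path) and once over a TCP port (addressed by host and port number). The file name demo.sock and both function names are made up for the example, and it assumes a Linux or macOS system where AF_UNIX is available.

```python
import os
import socket
import tempfile


def via_unix_socket(payload: bytes) -> bytes:
    """Round-trip over a UNIX socket: the address is a path in the filesystem."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical name; `ls` would show it
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server, \
                socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            server.bind(path)
            server.listen(1)
            client.connect(path)
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                return conn.recv(1024)


def via_tcp_port(payload: bytes) -> bytes:
    """Round-trip over TCP: the address is (host, port number)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        client.connect(server.getsockname())
        conn, _ = server.accept()
        with conn:
            client.sendall(payload)
            return conn.recv(1024)


print(via_unix_socket(b"ping"), via_tcp_port(b"ping"))
```

The two functions differ only in the address family and the address they bind to, which is the whole distinction: a path on this machine versus a number reachable through the network stack.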

Q: Does a Docker CLI command issued from bash invoke code that runs within the Docker Desktop VM?

Yes! The flow is:

WSL Bash
  ↓
docker CLI (client binary in WSL)
  ↓ (via socket: /var/run/docker.sock)
Docker Desktop VM
  ↓
dockerd (Docker daemon)
  ↓
Executes command (creates container, builds image, etc.)

The docker command in bash is just a remote control - all the real work happens inside Docker Desktop's VM.

The Docker Engine API is EXPOSED BY the daemon:

docker CLI
  ↓ (uses Docker Engine API)
dockerd (daemon)
  ↓ (exposes Docker Engine API)
  ↓ (uses lower-level APIs)
containerd
  ↓
runc (creates actual containers)
  ↓
Linux kernel (namespaces, cgroups)
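To make "the CLI uses the Docker Engine API" concrete, here is a hedged sketch of a client that speaks HTTP over the daemon's UNIX socket using only the Python standard library. The class name UnixHTTPConnection and the helper docker_version are invented for this example; GET /version is an Engine API endpoint, and the default socket path is the one mentioned above. It assumes a running daemon and permission to read the socket, so nothing is called at import time.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that dials a UNIX socket path instead of host:port."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # "localhost" is only used to fill in the HTTP Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def docker_version(socket_path: str = "/var/run/docker.sock") -> dict:
    """Ask the daemon for its version details via GET /version."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/version")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

On a machine with Docker running, calling docker_version() should return a dictionary of version fields, roughly the data that docker version prints for the server side.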

Q: Where do Docker images and containers actually exist?

On Windows/Mac with Docker Desktop:

Images and containers exist inside the Docker Desktop VM
Stored at /var/lib/docker/ within the VM
NOT on your Windows filesystem
Managed by Docker

What docker system df shows:

Images: Stored in VM (layered filesystem)
Containers: Writable layer on top of images, in VM
Local Volumes: Docker-managed, inside VM at /var/lib/docker/volumes/
Build Cache: Inside VM at /var/lib/docker/buildkit/

Exception: Bind mounts directly link to host folders.

Q: Is it accurate to say we don't think about "where" an image or container is located?

Yes! That's Docker's key design principle - abstraction.

As CLI users, we think in terms of:

Image names: ubuntu, nginx, python:3.9
Container names/IDs: my-web-server, abc123def456
Logical concepts: ports, volumes, networks

Docker handles the "where" for us - same commands work on Windows, Mac, Linux.

Exception: When you bind-mount a host folder (-v /host/path:/container/path), you DO specify host locations.

Images vs Containers

Q: What is the point of the Docker Image?

The Image is an intermediate template between Dockerfile and Container:

Dockerfile = Recipe (text instructions)
Image = Frozen meal (ready to cook, but not cooking yet)
Container = Hot meal on your plate (actively running)

Key benefits:

Build the Image once (slow)
Run many Containers from it (fast)
Share the Image - everyone gets identical environments
The Image remains unchanged when you run containers

Q: An Image is "a layered filesystem" - what does this mean?

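Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) adds a layer on top of the base image from FROM; other instructions only record metadata: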

FROM python:3.9        # Layer 1
RUN pip install flask  # Layer 2 (cached!)
COPY app.py .          # Layer 3

If you change only app.py and rebuild:

Layers 1 and 2 are reused from cache
Only Layer 3 is rebuilt
Result: Much faster development!
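The caching behaviour can be modelled with a toy hash chain. This is a simplified sketch and not Docker's actual cache-key algorithm: it assumes each layer's key depends on the previous layer's key, the instruction text, and (for COPY) the content of the copied file. The function names are made up for the example.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction; each key depends on all earlier layers."""
    keys = []
    parent = ""
    for instruction in instructions:
        material = parent + "\n" + instruction
        # For COPY, the copied file's content is part of the key (toy model).
        if instruction.startswith("COPY "):
            source = instruction.split()[1]
            material += "\n" + file_contents.get(source, "")
        parent = hashlib.sha256(material.encode()).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_layers(old_keys, new_keys):
    """1-based layer numbers whose key changed, i.e. layers that must be rebuilt."""
    return [i + 1 for i, (old, new) in enumerate(zip(old_keys, new_keys)) if old != new]


dockerfile = ["FROM python:3.9", "RUN pip install flask", "COPY app.py ."]
before = layer_keys(dockerfile, {"app.py": "print('v1')"})
after = layer_keys(dockerfile, {"app.py": "print('v2')"})
print(rebuilt_layers(before, after))  # [3]
```

If the COPY line were moved above the RUN line, the same edit to app.py would change layers 2 and 3 in this model, which is why stable dependencies usually go first and frequently edited code last.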

Containers vs VMs

Q: Isn't a Container just like a very bare-bones VM?

From user experience: Yes, it feels like a VM.

Technically: No, fundamentally different.

What makes it FEEL like a VM:

Own filesystem
Own process tree
Own network interface
Can install packages, run commands

What reveals it's NOT a VM:

Shared kernel: Container and host use the same kernel
No boot process: No OS to boot, so it typically starts in well under a second
Can't load kernel modules: VMs can, containers can't
Resource sharing: No hardware emulation layer

The illusion: Docker creates namespaces that make it appear isolated:

PID namespace: Process 1 inside, but actually process 54321 on host
Mount namespace: Own / filesystem view
Network namespace: Own IP address

Analogy:

VM: Separate computer running inside your computer
Container: Clever trick making one process think it's alone

Networking

Q: How do we map ports from host to Docker VM network space?

Port Mapping (forwarding):

-p 8080:8080

Host port 8080 → Container port 8080

Traffic is forwarded/proxied between two separate network spaces.

The mapping chain:

Browser (Windows)
    ↓
localhost:8080 (Windows)
    ↓ (Docker Desktop forwards)
Docker VM port 8080
    ↓ (Docker daemon routes)
Container port 8080
    ↓
Flask app listening on 0.0.0.0:8080

Q: When prime-web makes outbound requests to port 5000 on prime-api, does it use its own port?

Yes! It uses an ephemeral (temporary) port.

How it works:

OS assigns a random high-numbered port (on Linux, 32768-60999 by default)
prime-web uses this as the source port
Connection: prime-web:54321 → prime-api:5000 (54321 is example)
prime-api replies over the same connection: prime-api:5000 → prime-web:54321
After the connection closes, the ephemeral port is released

Port Roles:

Listening ports (servers): Fixed (8080, 5000)
Ephemeral ports (clients): Random, temporary

Analogy:

Listening port = business phone number (fixed, published)
Ephemeral port = your cell phone when you call them (temporary)
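The source port is easy to observe from code. The sketch below opens one TCP connection on the loopback interface and reports the ports on both ends; the function name observe_ports is made up for this example. The server here also uses an OS-chosen port, purely so the example never collides with something already running, whereas a real service such as prime-api would listen on a fixed port like 5000.

```python
import socket


def observe_ports():
    """Open one local TCP connection and report the ports involved."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the OS for any free port (only to keep the sketch safe to run).
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        listening_port = server.getsockname()[1]

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # The client never calls bind(): the OS assigns its source port.
            client.connect(("127.0.0.1", listening_port))
            source_port = client.getsockname()[1]

            conn, peer_address = server.accept()
            conn.close()

    return listening_port, source_port, peer_address[1]


listening, source, seen_by_server = observe_ports()
print(f"client :{source} -> server :{listening}")
```

The port the server sees as the peer is the same ephemeral port the client was assigned, and running the script twice will usually print a different client port each time.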

Q: Where does the Docker network (like prime-net) actually exist?

The virtual network exists inside the Linux kernel, as a software bridge plus per-container network namespaces that the Docker daemon creates and manages.

On Windows + WSL2:

Docker Desktop runs a hidden Linux VM
The network exists inside that VM's kernel
Managed by Linux kernel networking features (bridge, iptables, network namespaces)

What it physically is:

A software bridge (like a virtual network switch)
Network namespace isolation
Routing rules in the kernel

You can inspect it:

docker network inspect prime-net

Key point: It's not a "place" you can navigate to - it's a kernel data structure that routes packets between containers.

Q: Are there other types of networks besides bridge networks?

Yes! Docker ships several network drivers. The five you are most likely to meet:

Bridge (default): Containers on same bridge can talk, isolated from others
Host: Container uses host's network directly (no isolation)
None: No network at all (completely isolated)
Overlay: Spans multiple Docker hosts (used by Docker Swarm; Kubernetes uses its own network plugins instead)
Macvlan: Container gets its own MAC address (appears as physical device)

Most common: Bridge networks cover the vast majority of single-host use cases.

Volumes and Data

Q: Is a bind mount an alias for a host folder usable within the Container?

Exactly right! Perfect way to think about it.

Bind mount:

docker run -v /home/rob/mydata:/app/data ubuntu

Host folder: /home/rob/mydata
Container sees it as: /app/data
Same folder, two names (alias)
Write to /app/data/file.txt inside → appears at /home/rob/mydata/file.txt instantly

It behaves much like a symbolic link, although technically it is a mount: the container path is just another name for the host folder.

Q: Do we perform an analogous operation in mapping volumes to volumes?

Yes, very similar concept with an important distinction:

Port Mapping: Data is copied/forwarded

Packet arrives at host:8080
Docker forwards it to container:8080
Two separate network stacks

Volume Mapping: Data is shared directly

File written in container at /app/data/file.txt
Immediately visible on host at /home/rob/data/file.txt
Same underlying filesystem

Analogy:

Port mapping = Mail forwarding (copy and send)
Volume mapping = Shared folder (same files, different doors)

Building and Dependencies

Q: What happens if you don't stipulate a version number in requirements.txt?

Without version numbers:

flask
torch

Gets the latest version available at build time.

The Problem:

Today: Builds with flask==3.1.0 - works!
6 months later: Rebuilds with a new major release (say, a hypothetical flask==4.0.0) - breaking changes!
"But it worked before!"

Best Practice for Docker:

flask==3.1.0  # Lock version, reproducible

Trade-offs:

| Approach | Pros | Cons |
| --- | --- | --- |
| No versions | Always latest | Breaks unexpectedly |
| Exact versions | Reproducible | Miss security updates |
| `pip freeze` | Fully reproducible | Very rigid |

For Docker: Exact versions are best - containers should be reproducible!
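A small check can flag unpinned lines before a build. This is a rough sketch under simplifying assumptions: it treats a requirement as pinned only if it has the form name==version, skips comments and option lines, and does not attempt to parse every form pip accepts. The function name unpinned_requirements is made up for the example.

```python
import re

# name, optional [extras], then an exact "==" pin.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that do not use an exact '==' version pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


example = "flask==3.1.0\ntorch\nrequests>=2.0  # a range, not a pin\n"
print(unpinned_requirements(example))  # ['torch', 'requests>=2.0']
```

Running a check like this in CI, or as an early build step, turns "it worked before" surprises into an explicit failure at build time.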

Machine Learning with Docker

Q: How many containers could run on a G4dn.xlarge?

G4dn.xlarge specs:

4 vCPUs, 16 GB RAM
1x NVIDIA T4 GPU (16 GB VRAM)

Capacity by workload:

Light containers (APIs): 20-40 containers
Medium containers (ML inference): 4-8 containers
GPU-accelerated ML: 1-2 containers (GPU can't be easily shared)

Calculation for ML:

Available RAM: 16 GB - 2 GB (OS) = 14 GB
Per container: 2 GB
Max: 14 / 2 = 7 containers

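But the GPU is the bottleneck: containers can share one GPU, but they compete for its 16 GB of VRAM with no isolation between them, so typically only 1 container uses the GPU at a time.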

Best practice: Run 1 GPU container + 5-10 CPU-only support containers.

Q: In ResNet-N, what does N indicate?

N indicates the number of weighted layers (convolutional plus fully connected) in the network architecture, NOT categories.

ResNet-18: 18 layers deep
ResNet-50: 50 layers deep
ResNet-152: 152 layers deep

The number of categories is always 1000 (for ImageNet pre-trained models).

The depth (N) affects:

Model capacity
Accuracy (deeper usually = more accurate)
Speed (deeper = slower)
Parameters (deeper = more weights)

Q: What are "Residual connections"? Is this where "Res" comes from?

Yes! "Res" = Residual.

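The Problem ResNet Solved: Before ResNet (2015), plain networks deeper than ~20 layers often got worse, not better. This "degradation" showed up even in training error, so it was an optimization difficulty rather than overfitting; it is often loosely described as a vanishing-gradient problem.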

Residual Connections - The Solution:

Traditional network:

Input → Layer 1 → Layer 2 → Output

Each layer learns: Output = F(Input)

ResNet with skip connection:

Input ──────────────────┐
  │                     │
  └→ Layer 1 → Layer 2 ─┴→ Add → Output

Each block learns: Output = F(Input) + Input

The + Input is the skip connection (or residual connection).

Why This Works:

Learning the residual (difference) is easier
Skip connection provides "highway" for gradients
Even if F(x) learns poorly, you still have input passing through

Analogy:

Traditional: "Describe this person completely"
Residual: "Describe how this person differs from average" (easier!)
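The formula Output = F(Input) + Input can be tried out with plain Python lists. This is a toy sketch, not a real network layer: f stands in for whatever the block's layers compute, and the deliberately useless zero function shows the "input still passes through" property. All names are made up for the example.

```python
def plain_block(x, f):
    """Traditional block: the output is whatever the layers compute."""
    return f(x)


def residual_block(x, f):
    """Residual block: add the input back onto the layers' output, element-wise."""
    fx = f(x)
    return [a + b for a, b in zip(fx, x)]


def zero(v):
    """Stand-in for layers that have learned nothing useful."""
    return [0.0 for _ in v]


x = [1.0, -2.0, 3.0]
print(plain_block(x, zero))     # [0.0, 0.0, 0.0]
print(residual_block(x, zero))  # [1.0, -2.0, 3.0]
```

With the plain block the signal is lost entirely, while the residual block falls back to the identity, so stacking more such blocks cannot make the starting point worse than passing the input through.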

Q: Is it correct to ask "how many neurons per layer" in ResNet?

It's reasonable, but the answer varies wildly - and "neurons" means something different in CNNs.

Convolutional layers have:

Filters (channels): Number of feature maps
Spatial dimensions: Height × Width

ResNet-50 layer sizes (examples):

Input: 224×224×3 = ~150K values (about 50K pixels × 3 channels)
Early: 112×112×64 = ~800K activations
Middle: 28×28×512 = ~400K activations
Late: 7×7×2048 = ~100K activations
Final: 2048 → 1000 (fully-connected layer)

Better question: "How many channels per layer?"

ResNet-50: 64 → 256 → 512 → 1024 → 2048 channels as you go deeper
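The activation counts above are just height × width × channels. A short check of the arithmetic, using the example shapes listed in this answer (the stage labels are informal names, not identifiers from any library):

```python
def activation_count(height: int, width: int, channels: int) -> int:
    """Number of values in one feature-map tensor of shape H x W x C."""
    return height * width * channels


stages = {
    "input": (224, 224, 3),
    "early": (112, 112, 64),
    "middle": (28, 28, 512),
    "late": (7, 7, 2048),
}

for name, shape in stages.items():
    print(f"{name}: {activation_count(*shape):,}")
# input: 150,528
# early: 802,816
# middle: 401,408
# late: 100,352
```

The spatial size shrinks while the channel count grows, which is why the total per layer stays in the hundreds of thousands rather than following either trend alone.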

Q: Is an input image resampled to a standard size? Is it RGB?

Yes to both!

Standard preprocessing for ResNet:

Convert to RGB: Grayscale → 3-channel RGB, RGBA → drop alpha
Resize so the shorter side is 256 pixels: Maintains aspect ratio
Center crop to 224×224: Standard ImageNet input size
Normalize: Using ImageNet dataset statistics

Result: Any input image → 224×224×3 RGB, normalized

This is why you can upload any size/format image and it works!
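The geometry of these steps can be worked through without any image library. The sketch below assumes the common convention of scaling the shorter side to 256 before a 224×224 center crop, and uses the widely published ImageNet channel means and standard deviations; the exact preprocessing bundled with a given model may differ, and the function names are made up for the example.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def resize_shorter_side(width: int, height: int, target: int = 256):
    """Scale so the shorter side equals target, keeping the aspect ratio."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width: int, height: int, size: int = 224):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def normalize_pixel(rgb):
    """Map 0-255 RGB values to normalized floats, channel by channel."""
    return tuple(
        (value / 255 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


resized = resize_shorter_side(640, 480)
print(resized)                    # (341, 256)
print(center_crop_box(*resized))  # (58, 16, 282, 240)
```

A 640×480 photo is scaled to 341×256 and then cropped to the central 224×224 region, so a little of the left and right edges is discarded.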

Docker in Practice

Q: How do I calculate how many containers I can run on a single host?

Key Resource Constraints:

1. Memory (usually the bottleneck):

Max containers = (Available RAM - OS overhead) / Memory per container

Example:

Host: 16 GB RAM
OS + Docker: 2 GB
Each container: 512 MB
Max: (16 - 2) / 0.5 = 28 containers

2. CPU:

Max containers = CPU cores × oversubscription / CPU per container

Rule of Thumb:

Light containers (nginx, APIs): 50-100+ per host
Medium containers (web apps): 10-30 per host
Heavy containers (databases, ML): 5-10 per host

Best practice: Leave 20-30% headroom for spikes.
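Both formulas, plus the headroom rule, fit in a few lines. This is a back-of-the-envelope sketch: the memory figures come from the example above, while the CPU figures (4 cores, 2× oversubscription, 0.25 CPU per container) and the 25% headroom are assumed values chosen only for illustration. The function names are made up for the example.

```python
def max_by_memory(total_gb: float, overhead_gb: float, per_container_gb: float) -> int:
    """(Available RAM - OS overhead) / memory per container, rounded down."""
    return int((total_gb - overhead_gb) // per_container_gb)


def max_by_cpu(cores: int, oversubscription: float, cpu_per_container: float) -> int:
    """CPU cores x oversubscription / CPU per container, rounded down."""
    return int((cores * oversubscription) // cpu_per_container)


def with_headroom(memory_limit: int, cpu_limit: int, headroom: float = 0.25) -> int:
    """Take the tighter of the two limits and keep a fraction free for spikes."""
    return int(min(memory_limit, cpu_limit) * (1 - headroom))


by_memory = max_by_memory(16, 2, 0.5)  # the worked example above
by_cpu = max_by_cpu(4, 2.0, 0.25)      # assumed CPU figures
print(by_memory, by_cpu, with_headroom(by_memory, by_cpu))  # 28 32 21
```

Here memory is the tighter limit at 28 containers, and reserving a quarter of that for spikes gives a planning figure of 21.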

Q: When do I need to know the Docker build number?

When you need the build number:

Troubleshooting: Reporting bugs or asking for help
Compatibility: Some features require specific versions
Production: Ensure consistent environments across teams
Security: Check if you have patched versions

For learning Docker basics, you rarely need it.

Workshop Pedagogy

Q: How should I explain Dockerfile → Image → Container to students?

Excellent explanation structure:

"There are three related, central nouns in the Docker ecosystem: Dockerfile, Image, and Container.

Dockerfile is the starting point of a recipe for the Container
Image is an intermediate template that does not execute code. It is a snapshot that includes everything needed to create and run the Container
Container will execute a task when it runs

The Image is built using docker build. The subsequent command docker run creates a Container from the Image and then runs it. The image remains unchanged so the process can be repeated.

An Image is actually a layered filesystem. If we change only the Python code and do a new build, Docker rebuilds just that layer (and any layers after it), not the entire Image, which makes development faster."

Summary of Key Insights

Abstraction is key: Docker hides complexity (filesystem locations, networking details)
Images are templates: Build once, run many times
Containers are isolated processes: Not VMs, but feel like them
Networks enable communication: Container-to-container via names
Volumes persist data: Bind mounts for development, named volumes for production
Layers enable caching: Put stable dependencies first, changing code last
Reproducibility matters: Lock versions in requirements.txt

Additional Concepts

Q: What does `mkdir -p` do?

-p stands for "parents".

What it does:

Creates parent directories as needed
Doesn't error if directory already exists

mkdir -p ~/a/b/c/d    # Creates all intermediate directories
mkdir -p ~/dwdata     # Safe even if dwdata already exists

Without -p:

mkdir ~/a/b/c/d       # Fails if ~/a/b/c doesn't exist
mkdir ~/dwdata        # Errors if dwdata already exists

For workshops: mkdir -p is safer, but plain mkdir is simpler and teaches students to handle errors.

Q: What's the difference between Flask and nginx?

Flask is a Python web application framework for building web apps.

nginx is a production web server for serving web traffic.

| Aspect | Flask | nginx |
| --- | --- | --- |
| Purpose | Build web apps | Serve web traffic |
| Language | Python | C |
| Performance (rough, workload-dependent) | ~1000 req/sec | ~50,000+ req/sec |
| Static files | Slow | Fast |
| Concurrency | Limited | Excellent |
| Use case | Development, APIs | Production serving |

In production: nginx sits in front of Flask:

Browser → nginx:80 → Flask:5000

For workshops: Flask alone is perfect—students see the full stack in Python.

Q: Why does prime-checker use Flask if only prime-frontend serves HTML?

Both use Flask:

prime-checker (backend API): Flask exposes /check/<number> endpoint, returns JSON
prime-frontend (proxy + web): Flask serves HTML page AND forwards requests to prime-checker

Flask is used for both HTTP servers—one serves an API, the other serves web pages and proxies requests.

Q: Why did prime-checker have flask-cors installed?

It's not needed in the current setup.

CORS (Cross-Origin Resource Sharing) would be needed if the browser's JavaScript directly called prime-checker's API from a different origin. But since prime-frontend acts as a proxy, the browser only talks to prime-frontend (same origin). The backend call from prime-frontend to prime-checker is server-to-server, which doesn't trigger CORS restrictions.

Result: flask-cors was removed to reduce complexity.

Q: What does EXPOSE do in a Dockerfile?

EXPOSE is documentation only. It tells users "this container listens on port X" but doesn't actually open or publish the port.

EXPOSE 8080

The -p 8080:8080 flag does the actual port mapping regardless of EXPOSE.

EXPOSE is useful for:

Documentation (what ports does this app use?)
docker run -P (publish all exposed ports to random host ports)

For workshops: You could remove all EXPOSE lines and everything would still work with -p.

Q: Does EXPOSE refer to the Docker VM port or container port?

Container port. EXPOSE 8080 means "the application inside this container listens on port 8080."

It's about the container's internal network namespace, not the Docker VM or host.

Q: What does COPY do in a Dockerfile?

COPY copies files from the build context (host) to the image.

WORKDIR /app
COPY app.py .

Source: app.py in build context (the . from docker build .)
Destination: Current WORKDIR in the image (. means /app)

Result: app.py ends up at /app/app.py in the image.
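The destination rule can be mimicked with path arithmetic. This is a simplified model for a single file, not Docker's full COPY semantics (it ignores wildcards, directories as sources, and the case where the destination is an existing directory named without a trailing slash). The function name copy_destination is made up for the example.

```python
import posixpath


def copy_destination(workdir: str, source: str, dest: str) -> str:
    """Resolve where COPY <source> <dest> puts a single file, given WORKDIR."""
    # A relative destination is interpreted relative to WORKDIR.
    resolved = posixpath.normpath(posixpath.join(workdir, dest))
    # "." or a trailing slash means "into this directory", keeping the file name.
    if dest in (".", "./") or dest.endswith("/"):
        return posixpath.join(resolved, posixpath.basename(source))
    return resolved


print(copy_destination("/app", "app.py", "."))        # /app/app.py
print(copy_destination("/app", "app.py", "main.py"))  # /app/main.py
print(copy_destination("/app", "app.py", "/srv/"))    # /srv/app.py
```

The first call reproduces the case above: with WORKDIR /app, COPY app.py . lands at /app/app.py.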

Q: What's the difference between RUN and CMD in a Dockerfile?

RUN: Executes during docker build. Modifies the image. Creates a new layer.

CMD: Specifies what runs when container starts. Doesn't execute during build. If a Dockerfile contains several CMD lines, only the last one takes effect.

RUN pip install flask          # Happens at build time
CMD ["python", "app.py"]       # Happens at container start

Summary:

RUN = build the image
CMD = run the container

Q: What does `python -c` do?

Execute the following string as Python code, then exit.

python -c "print('hello')"

Runs the code without needing a .py file.

Used in ResNet Dockerfile to download model during build:

RUN python -c "from transformers import AutoModelForImageClassification; AutoModelForImageClassification.from_pretrained('microsoft/resnet-50')"

Q: What does the ResNet RUN command download?

The ~98MB download includes:

Model architecture definition (small, KB range)
Trained weights (the bulk, ~98MB)
Configuration files (small)

The weights ARE the model for inference purposes. There's no separate "model" file—the weights contain the learned parameters from training on ImageNet.

Q: Can I invoke Linux commands from Python REPL?

Not with !ls — that's IPython/Jupyter syntax, not standard Python REPL.

From Python REPL, use:

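import os
os.system('ls')           # Runs the command through a shell, returns its exit status

import subprocess
subprocess.run(['ls'])    # Preferred: no shell involved, arguments passed as a list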

The ! shortcut only works in IPython/Jupyter notebooks.

Q: How do I exit the Python REPL in a container?

exit() or quit() or Ctrl+D will all exit the Python REPL and stop the container.

Q: What happens if there's no CMD in the Dockerfile?

The CMD from the base image is inherited.

For example, FROM python:3.11-slim has CMD ["python3"] in its Dockerfile. So docker run fu-image (without specifying bash) would start a Python REPL.

If there's no CMD (or ENTRYPOINT) anywhere in the chain: docker run fails with a "no command specified" error unless you pass a command on the command line.

Q: What's the anatomy of the docker run command?

docker run [options] IMAGE [COMMAND]

Two expected arguments:

Image name or identifier (required)
Command (optional—uses CMD from Dockerfile if omitted)

Example:

docker run -it fu-image bash

-it: Interactive terminal flags
fu-image: Image name
bash: Command to run (overrides CMD)

The command is optional—if omitted, Docker uses the CMD from the Dockerfile.

Q: Does `bash` run "in perpetuity" when used as the command?

Not quite. bash doesn't run "in perpetuity"—it runs interactively because of the -it flags.

Without -it: bash would start and immediately exit (no stdin to read from).

With -it: bash stays alive because Docker attaches a terminal and keeps stdin open for it.

When you type exit: bash terminates, and the container stops.

Q: How does `-p 8080:8080` map ports?

docker run -p 8080:8080 prime-frontend

Format: -p HOST_PORT:CONTAINER_PORT

Left (8080): Port on your host machine
Right (8080): Port inside container where Flask is listening

Traffic flow:

Browser connects to localhost:8080 (host)
Docker forwards to container port 8080
Flask app inside container receives request
Response flows back the same path

If different: -p 9090:8080 means host port 9090 maps to container port 8080.
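The left/right convention is easy to get backwards, so a tiny parser can make it explicit. This sketch handles only the plain HOST_PORT:CONTAINER_PORT form described here; the real -p flag accepts more variants (such as a bind address or a protocol suffix), which this example deliberately rejects. The function name parse_publish is made up for the example.

```python
def parse_publish(spec: str) -> tuple[int, int]:
    """Split 'HOST_PORT:CONTAINER_PORT' into two validated integers."""
    host_text, separator, container_text = spec.partition(":")
    if not separator:
        raise ValueError(f"expected HOST_PORT:CONTAINER_PORT, got {spec!r}")
    host_port, container_port = int(host_text), int(container_text)
    for port in (host_port, container_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return host_port, container_port


host_port, container_port = parse_publish("9090:8080")
print(f"localhost:{host_port} on the host -> port {container_port} in the container")
```

For 9090:8080 the browser would use localhost:9090, while Flask inside the container keeps listening on 8080.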

Q: What are the default Docker networks?

Docker creates three default networks:

bridge: Default network. Containers can communicate by IP, but not by name. Used when you don't specify --network.
host: Container uses host's network directly. No isolation. Container ports are host ports (no -p mapping needed).
none: No networking. Completely isolated container.

Custom networks (like prime-net) provide DNS resolution—containers can reach each other by name.

Q: Is the bridge network as fast as custom networks?

Same speed—both are bridge networks under the hood.

To get IPs on default bridge:

docker inspect prime-api | grep IPAddress

Then hardcode that IP: http://172.17.0.2:5000

Bad idea because:

IPs change between runs
Requires manual lookup
Hardcoded IPs in code

Custom network with DNS (container names as hostnames) is simpler and more robust.

Q: When would a container use the host network?

Use cases:

Network performance critical (eliminates NAT overhead)
Need to bind to specific host interfaces
Network monitoring/scanning tools
Legacy apps that can't handle port mapping

Command:

docker run --network host my-image

Code inside container:

app.run(host='0.0.0.0', port=8080)

App binds to port 8080, immediately available on host at localhost:8080. No -p flag needed.

Downside: Port conflicts. If the host already uses port 8080, the app inside the container cannot bind to it and fails to start.

Q: What does `bash -c` mean?

-c means "command" (execute the following string as a command).

bash -c "echo hello"

It's not "clobber"—though the name does sound incongruously violent!

Q: What is `/app` in containers?

/app is a convention for where you put application code in production containers.

In Dockerfiles:

WORKDIR /app
COPY app.py .

WORKDIR /app creates the directory and sets it as the current working directory. When the container starts, you're in /app by default.

It's not required—just a widely-adopted convention. You could use /code, /src, or anything else, but /app is the de facto standard.

This Q&A document captures the conceptual foundation of Docker from our workshop session.
