TDM 40200: Project 11 — 2023

Motivation: Containers are everywhere and a very popular method of packaging an application with all of the requisite dependencies. In the previous series of projects you’ve built a web application. While right now it may be easy to share and run your application with another individual, as time goes on and packages are updated, this is less and less likely to be the case. Containerizing your application ensures that the application will have the proper versions of the proper packages available in the proper location to run.

Context: This is the first in a series of projects focused on containers. The end goal of this series is to solidify the concept of a container, and enable you to "containerize" the application you’ve spent the semester building. You will even get the opportunity to deploy your containerized application!

Scope: Python, containers, UNIX

Learning Objectives
  • Improve your mental model of what a container is and why it is useful.

  • Use UNIX tools to effectively create a container.

Make sure to read about and use the template found here, and to review the important information about project submissions here.

Questions

Question 1

The most popular containerization tool at the time of writing is likely Docker. Unfortunately, Docker is not available on Anvil, as it does not currently enable rootless container creation. In addition, for this first project, we want to mess around with some UNIX tools to essentially create a container — these tools also require superuser permissions. Therefore, this project will be completed entirely from within a shell, using shell tools, on a virtual machine which you will launch.

We will essentially be running a container on a virtual machine from within a SLURM job on Anvil. Sounds a bit crazy, and it is, but it will provide you with the ability to work fearlessly and break things. Of course, if you do break things, you can easily reset!

First things first. Open up a terminal on Anvil. This could be from within Jupyter Lab, via VS Code, or from an ssh session in your own terminal.

Next, to ensure that stale SLURM environment variables don’t alter or affect our new SLURM job, run the following.

# unset every environment variable whose name contains SLURM
for i in $(env | awk -F= '/SLURM/ {print $1}'); do unset $i; done;
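You can verify the variables are gone — the following should print nothing:

env | grep SLURM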

Next, let’s make a copy of a pre-made operating system image. This image has Alpine Linux and a few basic tools installed, including nano, vim, emacs, and Docker.

cp /anvil/projects/tdm/apps/qemu/images/builder.qcow2 $SCRATCH

Next, we want to acquire enough resources (CPU and memory) that we don’t have to worry about running out. To do this, we will use SLURM to launch a job with 4 cores and about 8GB of memory.

salloc -A cis220051 -p shared -n 4 -c 1 -t 04:00:00
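salloc will drop you into a shell on one of Anvil’s backend nodes. It is worth noting which node your job landed on — you will need it later to open a second connection to your virtual machine:

hostname # e.g. a240.anvil.rcac.purdue.edu -- yours may be different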

Next, we need to make qemu available to our shell.

module load qemu

Next, let’s launch our virtual machine with about 8GB of memory and 4 cores.

qemu-system-x86_64 -vnc none,ipv4 -hda $SCRATCH/builder.qcow2 -m 8G -smp 4 -enable-kvm -net nic -net user,hostfwd=tcp::2200-:22 &
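Since the command ends with &, qemu runs as a background job in your shell. You can confirm it started (and this is the same job you will later bring to the foreground with fg to kill it) using the jobs builtin:

jobs # should list the qemu-system-x86_64 process as job [1], Running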

If for some reason you get an error saying that port 2200 is already in use, no problem! Just change the number in the previous command from 2200 to the output of the following command.

module use /anvil/projects/tdm/opt/core
module load tdm
find_port # this will print a port number

Then, when you run the ssh command below, use the port number that was printed by find_port, instead of 2200.

Next, it’s time to connect to our virtual machine. We will use ssh to do this.

ssh -p 2200 tdm@localhost -o StrictHostKeyChecking=no

If the command fails, wait a moment and rerun it — the virtual machine may still be booting up.

When prompted for a password, enter purdue. Your username is tdm and password is purdue.

Finally, now that you have a shell in your virtual machine, you can do anything you want! You have superuser permissions within your virtual machine! To run a command as the superuser, prepend doas to the command. For example, to list the files in the /root directory, you would run: doas ls /root — it may prompt you for a password, which is purdue.

If at any time you break something and don’t know how to fix it, you can "reset" everything by simply killing the virtual machine, removing $SCRATCH/builder.qcow2, and rerunning the commands above.

To kill the virtual machine:

# exit the virtual machine by typing "exit" then, on Anvil, run:
fg %1 # or fg 1 -- this will bring the process to the foreground
# finally, press CTRL+C to kill the process
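Putting it all together, a full reset from Anvil might look like the following (a sketch, using the same paths and port as above):

# after killing the qemu process:
rm $SCRATCH/builder.qcow2
cp /anvil/projects/tdm/apps/qemu/images/builder.qcow2 $SCRATCH
qemu-system-x86_64 -vnc none,ipv4 -hda $SCRATCH/builder.qcow2 -m 8G -smp 4 -enable-kvm -net nic -net user,hostfwd=tcp::2200-:22 &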

For this question, submit a screenshot showing the output of hostname from within your virtual machine!

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 2

I would highly recommend watching www.youtube.com/watch?v=8fi7uSYlOdc. It is an excellent 40-minute video in which the author essentially creates a container using golang. While you may not understand golang, she does a great job of explaining, and it will give you a good idea of what is going on.

This video is the inspiration for this project; we are just translating it using our own tools and resources.

First things first. Let’s get a root filesystem that will be the "base" of our container. Since our virtual machine is running Alpine Linux, it could be cool to have our container be based on a different operating system — let’s use Ubuntu.

From within your virtual machine, run the following.

From this point forward, when we ask you to run any command, please assume we mean from inside your virtual machine unless otherwise specified.

wget https://releases.ubuntu.com/20.04.6/ubuntu-20.04.6-live-server-amd64.iso

This will download the .iso file from Ubuntu. Next, we need to mount the .iso file so that we can access the files within it.

# create a directory to mount the iso file on
mkdir /home/tdm/ubuntu_mounted

# mount the iso file
doas modprobe loop
doas mount -t iso9660 ubuntu-20.04.6-live-server-amd64.iso ubuntu_mounted
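You can confirm the image mounted successfully before proceeding:

mount | grep ubuntu_mounted # should show an iso9660, read-only (ro) mount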

Now, if you run ls -la /home/tdm/ubuntu_mounted, you should see a bunch of files and directories.

ls -la /home/tdm/ubuntu_mounted
total 83
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 .
drwxr-sr-x    4 tdm      tdm           4096 Mar 30 11:11 ..
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 .disk
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 EFI
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 boot
dr-xr-xr-x    1 root     root          2048 Mar 14 18:02 casper
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 dists
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 install
dr-xr-xr-x    1 root     root         34816 Mar 14 18:01 isolinux
-r--r--r--    1 root     root         27491 Mar 14 18:02 md5sum.txt
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 pool
dr-xr-xr-x    1 root     root          2048 Mar 14 18:01 preseed
lr-xr-xr-x    1 root     root             1 Mar 14 18:01 ubuntu -> .

We want the filesystem from this iso. The filesystem is inside the following file: /home/tdm/ubuntu_mounted/casper/filesystem.squashfs. We need to extract that file, but before we can do that, we need to install a package.

doas apk add squashfs-tools
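You can confirm the tool is now available:

which unsquashfs # should print the path to the unsquashfs binary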

Now, we can extract the file.

mkdir /home/tdm/ubuntu_fs
cp /home/tdm/ubuntu_mounted/casper/filesystem.squashfs /home/tdm/ubuntu_fs
cd /home/tdm/ubuntu_fs
doas unsquashfs filesystem.squashfs # extracts the filesystem into squashfs-root/
cd
doas mv /home/tdm/ubuntu_fs/squashfs-root /home/tdm/ # move the extracted tree aside
rm -rf /home/tdm/ubuntu_fs/* # remove the copied archive
doas cp -r /home/tdm/squashfs-root/* /home/tdm/ubuntu_fs/ # copy the root filesystem into place

# cleanup
doas umount ubuntu_mounted
rmdir /home/tdm/ubuntu_mounted
doas rm ubuntu-20.04.6-live-server-amd64.iso
doas rm -rf /home/tdm/squashfs-root

Finally, inside /home/tdm/ubuntu_fs, you should see the root filesystem for Ubuntu.

ls -la /home/tdm/ubuntu_fs
total 72
drwxr-sr-x   18 tdm      tdm           4096 Mar 30 11:29 .
drwxr-sr-x    4 tdm      tdm           4096 Mar 30 11:32 ..
lrwxrwxrwx    1 tdm      tdm              7 Mar 30 11:29 bin -> usr/bin
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 boot
drwxr-xr-x    5 tdm      tdm           4096 Mar 30 11:29 dev
drwxr-xr-x   95 tdm      tdm           4096 Mar 30 11:29 etc
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 home
lrwxrwxrwx    1 tdm      tdm              7 Mar 30 11:29 lib -> usr/lib
lrwxrwxrwx    1 tdm      tdm              9 Mar 30 11:29 lib32 -> usr/lib32
lrwxrwxrwx    1 tdm      tdm              9 Mar 30 11:29 lib64 -> usr/lib64
lrwxrwxrwx    1 tdm      tdm             10 Mar 30 11:29 libx32 -> usr/libx32
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 media
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 mnt
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 opt
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 proc
drwx------    2 tdm      tdm           4096 Mar 30 11:29 root
drwxr-xr-x   11 tdm      tdm           4096 Mar 30 11:29 run
lrwxrwxrwx    1 tdm      tdm              8 Mar 30 11:29 sbin -> usr/sbin
drwxr-xr-x    6 tdm      tdm           4096 Mar 30 11:29 snap
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 srv
drwxr-xr-x    2 tdm      tdm           4096 Mar 30 11:29 sys
drwxr-xr-t    2 tdm      tdm           4096 Mar 30 11:29 tmp
drwxr-xr-x   14 tdm      tdm           4096 Mar 30 11:29 usr
drwxr-xr-x   13 tdm      tdm           4096 Mar 30 11:29 var

Awesome! We are going to use this later!

For this question, please include a screenshot of the final "product" — the output of the ls -la command on the /home/tdm/ubuntu_fs directory.

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 3

As mentioned before, we are going to follow this excellent post very closely. Therefore, the first tool we will be using is chroot (think "change root"). chroot is a command that allows you to change the root directory of the current process and its children.

Currently, our root filesystem (that of our Alpine Linux virtual machine) is the following:

ls -la /
total 85
drwxr-xr-x   22 root     root          4096 Feb  8 09:06 .
drwxr-xr-x   22 root     root          4096 Feb  8 09:06 ..
drwxr-xr-x    2 root     root          4096 Mar 30 11:22 bin
drwxr-xr-x    3 root     root          1024 Feb  8 09:14 boot
drwxr-xr-x   13 root     root          3120 Mar 30 10:56 dev
drwxr-xr-x   35 root     root          4096 Mar 30 10:56 etc
drwxr-xr-x    4 root     root          4096 Mar 30 10:10 home
drwxr-xr-x   10 root     root          4096 Feb  8 09:14 lib
drwx------    2 root     root         16384 Feb  8 08:59 lost+found
drwxr-xr-x    5 root     root          4096 Feb  8 08:59 media
drwxr-xr-x    2 root     root          4096 Feb  8 08:59 mnt
drwxr-xr-x    3 root     root          4096 Feb  8 09:19 opt
dr-xr-xr-x  149 root     root             0 Mar 30 10:56 proc
drwx------    2 root     root          4096 Feb  8 09:09 root
drwxr-xr-x    8 root     root           440 Mar 30 11:11 run
drwxr-xr-x    2 root     root         12288 Feb  8 09:16 sbin
drwxr-xr-x    2 root     root          4096 Feb  8 08:59 srv
drwxr-xr-x    2 root     root          4096 Feb  8 09:06 swap
dr-xr-xr-x   13 root     root             0 Mar 30 10:56 sys
drwxrwxrwt    4 root     root            80 Mar 30 10:56 tmp
drwxr-xr-x    9 root     root          4096 Mar 30 10:17 usr
drwxr-xr-x   13 root     root          4096 Mar 30 10:17 var

We want to make it so that our root filesystem is the contents of our ubuntu_fs directory. To do this, we will use the chroot command.

doas chroot /home/tdm/ubuntu_fs /bin/bash

This runs the /bin/bash shell with its root directory set to /home/tdm/ubuntu_fs, so you’ll have a bash shell whose entire filesystem is the contents of that directory. As a result, for example, you can run commands that are only available in Ubuntu:

lsb_release -a
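If everything worked, you should see output along these lines (exact fields may vary slightly):

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal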

As you will be able to see, in this shell, the root filesystem is the contents of the /home/tdm/ubuntu_fs directory:

ls -la /
total 72
drwxr-sr-x 18 1001 1001 4096 Mar 30 16:29 .
drwxr-sr-x 18 1001 1001 4096 Mar 30 16:29 ..
lrwxrwxrwx  1 1001 1001    7 Mar 30 16:29 bin -> usr/bin
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 boot
drwxr-xr-x  5 1001 1001 4096 Mar 30 16:29 dev
drwxr-xr-x 95 1001 1001 4096 Mar 30 16:29 etc
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 home
lrwxrwxrwx  1 1001 1001    7 Mar 30 16:29 lib -> usr/lib
lrwxrwxrwx  1 1001 1001    9 Mar 30 16:29 lib32 -> usr/lib32
lrwxrwxrwx  1 1001 1001    9 Mar 30 16:29 lib64 -> usr/lib64
lrwxrwxrwx  1 1001 1001   10 Mar 30 16:29 libx32 -> usr/libx32
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 media
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 mnt
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 opt
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 proc
drwx------  2 1001 1001 4096 Mar 30 16:29 root
drwxr-xr-x 11 1001 1001 4096 Mar 30 16:29 run
lrwxrwxrwx  1 1001 1001    8 Mar 30 16:29 sbin -> usr/sbin
drwxr-xr-x  6 1001 1001 4096 Mar 30 16:29 snap
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 srv
drwxr-xr-x  2 1001 1001 4096 Mar 30 16:29 sys
drwxr-xr-t  2 1001 1001 4096 Mar 30 16:38 tmp
drwxr-xr-x 14 1001 1001 4096 Mar 30 16:29 usr
drwxr-xr-x 13 1001 1001 4096 Mar 30 16:29 var

So, when in this shell, running ls -la is actually running /home/tdm/ubuntu_fs/usr/bin/ls -la. Very cool! This is pretty powerful already and may even feel kind of like a container! Let’s test out how isolated we are. Open another terminal and connect to your virtual machine from that terminal as well. This will involve first using ssh to connect to the backend where your SLURM job is running, and then using ssh to connect to your virtual machine from there.

ssh a240.anvil.rcac.purdue.edu # connect to the given backend -- in my case, it was a240 -- yours may be different!
ssh -p 2200 tdm@localhost -o StrictHostKeyChecking=no # connect to the virtual machine

Once you are connected to your virtual machine, run the following command:

top

Now, in your chroot "jail", run the following commands:

mount -t proc proc /proc
ps aux | grep -i top

If done correctly, you likely saw output similar to the following.

1001      2617  0.0  0.0   1624   960 ?        S+   16:49   0:00 top
root      2622  0.0  0.0   3312   720 ?        S+   16:50   0:00 grep --color=auto -i top

We are inside our container, yet we can see the top command running on our VM. We are clearly not isolated enough! In fact, from within our "container" we could probably even kill the top process that is outside of our "container":

pkill top

# after running this from inside our "container" switch tabs and you'll find that the top process stopped running!

To fix this, we need to create a namespace.

Namespaces allow us to create restricted views of systems like the process tree, network interfaces, and mounts.

Creating namespaces is super easy, just a single syscall with one argument, unshare. The unshare command line tool gives us a nice wrapper around this syscall and lets us setup namespaces manually. In this case, we will create a PID namespace for the shell, then execute the chroot like the last example.

— Eric Chiang
https://ericchiang.github.io/post/containers-from-scratch/

Let’s test this out. First, exit our "container" by running exit. If you properly exited, the following command will no longer work.

lsb_release -a

Now, let’s use unshare to create a new PID (process ID) namespace.

doas unshare -p -f --mount-proc=/home/tdm/ubuntu_fs/proc chroot /home/tdm/ubuntu_fs /bin/bash

Upon success, you will now find that our shell /bin/bash seems to think it is process 1!

ps aux
output
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4248  3428 ?        S    16:59   0:00 /bin/bash
root        12  0.0  0.0   5900  2764 ?        R+   17:00   0:00 ps aux

We are one step closer! For this question, include a series of screenshots showing your terminal input and output.

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 4

Finally, another key component of a container is limiting resources. As Eric mentions, it doesn’t make a lot of sense to have isolated processes if they can still eat up all of the system’s CPU and memory, potentially even causing other processes on the host system to crash. This is where cgroups (control groups) come in.

Using cgroups we can limit the resources a process can use. For example, we could limit the CPUs or the memory of a process. That is exactly what we will do! Let’s start by restricting the cores our container can use.

On the virtual machine, outside of the container, run the following.

doas su # become the superuser/root

mkdir /sys/fs/cgroup/cpuset/tdm # create a directory for our cpuset cgroup

ps aux

The output of ps aux should look something like the following.

ps aux output
 2028 root      0:00 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
 2300 root      0:00 /sbin/syslogd -t -n
 2329 root      0:00 /sbin/acpid -f
 2361 chrony    0:00 /usr/sbin/chronyd -f /etc/chrony/chrony.conf
 2388 root      0:00 /usr/sbin/crond -c /etc/crontabs -f
 2426 root      0:00 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups
 2433 root      0:00 /sbin/getty 38400 tty1
 2434 root      0:00 /sbin/getty 38400 tty2
 2438 root      0:00 /sbin/getty 38400 tty3
 2442 root      0:00 /sbin/getty 38400 tty4
 2446 root      0:00 /sbin/getty 38400 tty5
 2450 root      0:00 /sbin/getty 38400 tty6
 2454 root      0:00 sshd: tdm [priv]
 2456 tdm       0:00 sshd: tdm@pts/0
 2457 tdm       0:00 -zsh
 2514 root      0:00 unshare -p -f --mount-proc=/home/tdm/ubuntu_fs/proc chroot /home/tdm/ubuntu_fs /bin/bash
 2515 root      0:00 /bin/bash
 2523 root      0:00 sshd: tdm [priv]
 2525 tdm       0:00 sshd: tdm@pts/1
 2526 tdm       0:00 -zsh
 2528 root      0:00 zsh
 2530 root      0:00 ps aux

Notice the line directly below the line with unshare -p -f …​ — this is the PID of the process we want to restrict! In this case, it is 2515.

echo 0 > /sys/fs/cgroup/cpuset/tdm/cpuset.mems # restrict to memory node 0
echo 0 > /sys/fs/cgroup/cpuset/tdm/cpuset.cpus # restrict to CPU 0
echo 2515 > /sys/fs/cgroup/cpuset/tdm/tasks # add our container's PID to the cgroup

# this limits the task with PID 2515 to only use CPU 0

mkdir /sys/fs/cgroup/memory/tdm # create a directory for our memory cgroup

# in addition, let's disable swap
echo 0 > /sys/fs/cgroup/memory/tdm/memory.swappiness

# let's also limit the memory to 100 MB
echo 100000000 > /sys/fs/cgroup/memory/tdm/memory.limit_in_bytes

echo 2515 > /sys/fs/cgroup/memory/tdm/tasks
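Before testing, you can read the values back to make sure the limits took effect (a quick sanity check; the PID will be whatever you echoed above):

cat /sys/fs/cgroup/cpuset/tdm/cpuset.cpus # should print 0
cat /sys/fs/cgroup/memory/tdm/memory.limit_in_bytes # roughly 100000000 (the kernel rounds to a page-size multiple)
cat /sys/fs/cgroup/memory/tdm/tasks # should list your container's PID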

Let’s test out the memory cgroup by creating the following hungry.py Python script and running it from within our container.

hungry.py
x = bytearray(1024*1024*50)
print("Used 50")
y = bytearray(1024*1024*50)
print("Used 100")
z = bytearray(1024*1024*50)
print("Used 150")

Now, running python3 hungry.py from within our container should yield:

output
Used 50
Killed

Very cool! The process was killed because it exceeded the memory limit we set! Hopefully this project demonstrated that containers are simpler than they may seem. Of course, these examples are not complete; the containers and various utilities provided by a tool like Docker are both more feature-rich and more robust. However, we hope this demystified things a little bit.
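If you would also like to see the cpuset limit in action (not required, just a sketch), start a few busy loops inside the container and watch CPU usage from another terminal on the VM. Since the cgroup pins the container to CPU 0, total usage should stay near one core’s worth:

# inside the container: start four CPU-bound processes
for i in 1 2 3 4; do yes > /dev/null & done

# on the VM (outside the container), watch `top` -- despite four busy
# loops, combined CPU usage should stay around a single core's worth

# cleanup, inside the container:
pkill yes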

There will still be a variety of things that aren’t functioning the same way a true container would. For example, running hostname newhost in the container would also change the hostname of the VM. You could fix this by adding --uts to your unshare command. Again, this is just to show you that containers are really just a filesystem + some system calls to isolate the process.
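A minimal sketch of that fix (--uts gives the container its own UTS namespace, which is what holds the hostname):

doas unshare -p -f --uts --mount-proc=/home/tdm/ubuntu_fs/proc chroot /home/tdm/ubuntu_fs /bin/bash

# then, inside the container:
hostname newhost
hostname # prints newhost, while the VM's hostname is unchanged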

For this question, like the previous questions, include some screenshots of your terminal’s input and output that demonstrate you were able to see the expected results.

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Please make sure to double check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting to make sure that what you think you submitted is what you actually submitted.

In addition, please review our submission guidelines before submitting your project.