Recently, I have been working on some existing projects, trying to implement a graceful shutdown mechanism. The initial idea was to have the application invoke the destructor of each component as soon as it receives a specific signal such as SIGINT. The idea worked really well when I ran the application natively on my MacBook Pro. However, with docker stop and docker kill it didn't work as expected: the application did not receive the signal and never performed the corresponding cleanup tasks. So, what exactly is the Docker container shutdown process?
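For context, the mechanism described above might look roughly like the following Python sketch. The names GracefulShutdown, register and wait_for_shutdown are hypothetical and not taken from any of the projects mentioned; the only assumption is that each component exposes a zero-argument cleanup callable.

```python
import signal
import time


class GracefulShutdown:
    """Hypothetical helper: records a shutdown request and runs cleanups."""

    def __init__(self, signals=(signal.SIGINT, signal.SIGTERM)):
        self.requested = False
        self._cleanups = []
        for sig in signals:
            signal.signal(sig, self._on_signal)

    def register(self, cleanup):
        """Add a zero-argument callable that releases one component."""
        self._cleanups.append(cleanup)

    def _on_signal(self, signum, frame):
        # Keep the handler tiny: only note that shutdown was requested.
        self.requested = True

    def run_cleanups(self):
        # Tear components down in reverse order of registration.
        while self._cleanups:
            self._cleanups.pop()()


def wait_for_shutdown(shutdown, poll_interval=0.1):
    """Block until a signal arrives, then run every registered cleanup."""
    while not shutdown.requested:
        time.sleep(poll_interval)
    shutdown.run_cleanups()
```

A server would create one GracefulShutdown at startup, register a cleanup for each component, and call wait_for_shutdown from its main thread. None of this helps if the signal never reaches the process, which is the problem the rest of this post is about.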
When you run a Docker container, by default it gets its own PID namespace, which means the processes in the container are isolated from other processes on your host. A PID namespace is a tree structure that starts from PID1, commonly called init on a Linux system. PID1 has an important task: reaping zombie processes. So what is the counterpart of PID1 in a Docker container? Let's look at some scenarios, following Yelp's article.
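When we use docker run, the command that starts the container can be written in the Dockerfile in 2 forms (ENTRYPOINT accepts the same two forms):
CMD <command> (shell form)
CMD ["executable", "param1", "param2"] (exec form)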
- docker run (on the host machine)
  - /bin/bash (PID1, inside container)
    - python server.py (PID2, inside container)
This is scenario 1, the shell form: Docker starts a shell (/bin/sh -c by default, shown here as /bin/bash) as PID1, and the shell runs your program as a subprocess. There is a problem with this approach: when a signal is sent to the shell, it is not forwarded to the subprocess. This pretty much breaks our application. Consider the situation where requests are still in flight and the data being processed exists only in memory. If the server is terminated without being notified by a signal, many requests may fail and the data being processed may never be written back to the database.
- docker run (on the host machine)
  - python server.py (PID1, inside container)
This is scenario 2, the exec form: our program runs as PID1. This is much better than scenario 1 because the application can handle the signal directly. But if you use the exec form to run a shell script that spawns your application, remember to start the application with the shell's exec builtin (for example, exec python server.py) so that it replaces /usr/bin/bash as PID1; otherwise it will behave like scenario 1.
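The reason exec helps is that it replaces the current process image instead of starting a child, so the PID stays the same. The sketch below illustrates this with Python's os.execv, which wraps the same system call family the shell builtin uses. LAUNCHER and pids_before_and_after_exec are hypothetical names made up for this demonstration.

```python
import subprocess
import sys

# Code for a stand-in "entrypoint": print our PID, then exec the real program.
# stdout is flushed first because buffered output is lost when the image is replaced.
LAUNCHER = (
    "import os, sys\n"
    "print(os.getpid(), flush=True)\n"
    "os.execv(sys.executable, [sys.executable, '-c', 'import os; print(os.getpid())'])\n"
)


def pids_before_and_after_exec():
    """Run the launcher and return the PID it reports before and after exec."""
    result = subprocess.run(
        [sys.executable, "-c", LAUNCHER],
        capture_output=True,
        text=True,
        check=True,
    )
    before, after = (int(line) for line in result.stdout.split())
    return before, after
```

Both numbers are the same PID: the launcher did not spawn a child, it became the program. Inside a container, this is what lets the application take over PID1 from the entrypoint script.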
Using the exec form seems pretty good, but it leads to another problem: zombie process handling. The best practice is to write the program so that it never leaves zombie processes behind, but I often see programs generate them anyway. They are hard to track down because they may be created by frameworks or libraries rather than by your own code. I should mention that in scenario 1, /bin/bash does reap zombie processes. So inevitably, we need to ask whether there is a better solution.
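To make the zombie problem concrete: a child that has exited stays in the process table as a zombie until its parent collects its exit status. The following is a minimal sketch of that collection step for a POSIX system; the function names are hypothetical, and a real init process would typically run it in response to SIGCHLD rather than by polling.

```python
import os
import time


def spawn_short_lived_child():
    """Fork a child that exits at once; it remains a zombie until reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, skipping the parent's cleanup code
    return pid


def reap_zombies():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        reaped.append(pid)
    return reaped


def demo(count=3, timeout=5.0):
    """Create `count` short-lived children and reap them as they exit."""
    pending = {spawn_short_lived_child() for _ in range(count)}
    reaped = []
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for pid in reap_zombies():
            pending.discard(pid)
            reaped.append(pid)
        time.sleep(0.01)
    return reaped
```

An application running as PID1 that never calls something like reap_zombies will accumulate these entries, including orphaned grandchildren that get re-parented to it.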
Tini (https://github.com/krallin/tini) is a project aiming to tackle this problem. According to its README, the benefits of using Tini are:
- It protects you from software that accidentally creates zombie processes, which can (over time!) starve your entire system for PIDs (and make it unusable).
- It ensures that the default signal handlers work for the software you run in your Docker image. For example, with Tini, SIGTERM properly terminates your process even if you didn’t explicitly install a signal handler for it.
We can simply run tini as PID1, and it will forward signals to its subprocess. In short, tini acts as a signal proxy and also reaps zombie processes automatically. With Docker 1.13 or later, you can run your program under tini by passing the --init flag to docker run.
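The signal-proxy idea can be sketched in a few lines of Python. This is not how tini is implemented (tini is written in C and handles many more cases); it only illustrates the forwarding behaviour, and run_and_forward and FORWARDED are hypothetical names.

```python
import signal
import subprocess
import sys

# Signals this sketch relays; a real init forwards many more.
FORWARDED = (signal.SIGTERM, signal.SIGINT)


def run_and_forward(argv):
    """Start argv as a child and relay termination signals to it."""
    child = None

    def forward(signum, frame):
        # Pass the signal on to the child instead of acting on it ourselves.
        # A signal that arrives before the child exists is dropped in this sketch.
        if child is not None:
            child.send_signal(signum)

    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        child = subprocess.Popen(argv)
        return child.wait()
    finally:
        # Restore whatever handlers were installed before we started.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

For example, run_and_forward([sys.executable, "server.py"]) would keep the wrapper alive as the parent while server.py receives every SIGTERM or SIGINT sent to the wrapper. The return value is the child's exit status, which is negative when the child was terminated by a signal.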
It is worth mentioning a similar project, dumb-init by Yelp, which can also be installed as a Python package from PyPI.
Let us take a look at the 2 Docker commands related to shutting down a container.
When we use docker stop, the main process inside the container receives SIGTERM. The Docker daemon then waits 10 seconds (by default) for the container to stop and, if it is still running, sends SIGKILL to terminate the process.
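The same "ask politely, then force" sequence can be reproduced for an ordinary child process, which is a handy way to test a shutdown handler without Docker. This is a sketch of the pattern only, not Docker's implementation; stop_process and demo_stubborn_child are hypothetical names.

```python
import signal
import subprocess
import sys


def stop_process(proc, timeout=10.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        return proc.wait()


def demo_stubborn_child(timeout=0.5):
    """Return True if a child that ignores SIGTERM had to be killed."""
    stubborn = subprocess.Popen(
        [
            sys.executable,
            "-c",
            "import signal, time; "
            "signal.signal(signal.SIGTERM, signal.SIG_IGN); "
            "print('ready', flush=True); time.sleep(60)",
        ],
        stdout=subprocess.PIPE,
    )
    stubborn.stdout.readline()  # wait until the child ignores SIGTERM
    returncode = stop_process(stubborn, timeout=timeout)
    stubborn.stdout.close()
    return returncode == -signal.SIGKILL
```

A process that exits on SIGTERM within the timeout is never sent SIGKILL, which is why the cleanup code has to finish inside that window.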
docker kill kills running containers immediately. It is more like kill -9, so docker stop is what we would rather use: it lets the container perform its cleanup tasks after receiving SIGTERM.
Knowing the timeout is also important for implementing graceful shutdown. We need to set a reasonable timeout so that containers have time to finish their cleanup. The default can be configured both on the daemon and per container.
shutdownTimeout: Docker daemon
stopTimeout: Docker container
When the Docker daemon receives SIGTERM, it sends SIGTERM to all containers. The longest timeout is applied.
- Use exec form to run your program
- Use exec in your shell script
- Know which process is PID1 in your Docker container
- Set a reasonable timeout in the Docker daemon config
- Consider docker run --init