In this post I'll discuss zero-downtime deployments using unicorn and supervisord. There's a lot more to zero-downtime deployments than just keeping your website available. Listen to Ruby Rogues Ep. 71 or search Google for a broader discussion of the problems involved.
When running a web application in production you should strive for 100% reachability. Downtime is normally perceived as an error in your application, and rightfully so. If you deploy often, your users might stop using your app because of the 502 errors they encounter.
Since I like to use supervisord in my production setup, the most widely used unicorn setup for zero-downtime deployments does not work out of the box.
Supervisord requires the unicorn process not to daemonize. Also, sending SIGUSR2 to unicorn starts a new master, and the old master exits once the new one has taken over.
Since supervisord watches the old master, it will consider the application exited, even though it's still running under a new process id.
Finally, supervisord will try to restart the application and fail, because all sockets are in use by the new unicorn master.
Luckily, there's a utility called unicornherder. Unicornherder does not daemonize itself and keeps an eye on the unicorn pid file to check whether unicorn is still alive. All signals sent to unicornherder are forwarded to the unicorn process. If unicorn quits, unicornherder quits too.
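To make the idea concrete, here is a rough Ruby sketch of what a herder-style wrapper does. This is not unicornherder's actual code (unicornherder is a Python tool); the method names `read_pid`, `process_alive?`, `forward_signals` and `herd` are made up for illustration, and the real tool is more careful around the moment the pid file is swapped during a restart.

```ruby
# Hypothetical sketch of a pid-file watcher, NOT unicornherder's implementation.

# Read a pid from a pid file; nil if the file is missing or malformed.
def read_pid(pid_file)
  pid = Integer(File.read(pid_file).strip, 10)
  pid.positive? ? pid : nil
rescue Errno::ENOENT, ArgumentError
  nil
end

# Signal 0 only checks whether the process exists; nothing is delivered.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # the process exists but belongs to another user
end

# Pass each listed signal on to whichever pid is currently in the pid file.
def forward_signals(pid_file, signals = %w[HUP USR1 USR2 QUIT TERM])
  signals.each do |sig|
    Signal.trap(sig) do
      pid = read_pid(pid_file)
      begin
        Process.kill(sig, pid) if pid
      rescue Errno::ESRCH
        # the process went away between reading the file and signalling it
      end
    end
  end
end

# Stay in the foreground until the process named in the pid file is gone.
def herd(pid_file, interval: 1)
  forward_signals(pid_file)
  loop do
    pid = read_pid(pid_file)
    break unless pid && process_alive?(pid)
    sleep interval
  end
end
```

Because the wrapper stays in the foreground and re-reads the pid file on every check, supervisord can keep watching one stable process while the unicorn master behind it is replaced.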
So, in order to use preload_app for zero-downtime deployments, we need to install unicornherder.
    # assuming you are running Ubuntu:
    $ sudo apt-get install python-dev
    $ pip install unicornherder
    $ which unicornherder
    # => /usr/local/bin/unicornherder
Unicornherder itself does not require an additional configuration file. All required arguments are passed on the command line.
Next we need to configure supervisord:
Supervisord watches unicornherder, and unicornherder starts unicorn as a daemon. So all we need to do is to properly start unicornherder and make sure it keeps running.
Here's a sample supervisord configuration file I generated using
    [program:myapp-unicornherder-1]
    command=/home/webapp/.rvm/bin/app_bundle exec unicornherder -u unicorn -p tmp/pids/unicorn.pid -- -c config/unicorn.rb
    autostart=true
    autorestart=true
    stopsignal=QUIT
    stdout_logfile=/home/webapp/shared/log/unicornherder-1.log
    stderr_logfile=/home/webapp/shared/log/unicornherder-1.error.log
    user=webapp
    directory=/home/webapp/current
    environment=RAILS_ENV="production",APP_PATH="/home/webapp/current",SHARED_PATH="/home/webapp/shared",TEMP_PATH="/home/webapp/shared/tmp",PORT="8619"

    [group:myapp]
    programs=myapp-unicornherder-1
- unicornherder is passed the path to the unicorn pidfile using the -p flag
- supervisord will send the QUIT signal to unicornherder if we want to stop unicorn.
- unicorn is executed in an RVM-managed environment, and I'm using an RVM wrapper to load the correct ruby version and gemset.
- basic unicorn configuration settings are exported into the environment
The unicorn configuration follows:
    worker_processes(ENV['RAILS_ENV'] == 'development' ? 2 : 8)
    working_directory ENV["APP_PATH"]
    listen ENV["PORT"].to_i, :tcp_nopush => true
    timeout 30
    pid ENV["TEMP_PATH"] + "/pids/unicorn.pid"
    stderr_path ENV["SHARED_PATH"] + "/log/unicorn.stderr.log"
    stdout_path ENV["SHARED_PATH"] + "/log/unicorn.stdout.log"
    preload_app true

    before_fork do |server, worker|
      # the master has no use for a database connection
      if defined?(ActiveRecord::Base)
        ActiveRecord::Base.connection.disconnect!
      end

      # after a USR2 restart the old master's pid lives in the .oldbin file
      old_pid = ENV["TEMP_PATH"] + '/pids/unicorn.pid.oldbin'
      if File.exist?(old_pid) && server.pid != old_pid
        begin
          Process.kill("QUIT", File.read(old_pid).to_i)
        rescue Errno::ENOENT, Errno::ESRCH
          # someone else did our job for us
        end
      end
    end

    after_fork do |server, worker|
      # every worker opens its own database connection
      if defined?(ActiveRecord::Base)
        ActiveRecord::Base.establish_connection
      end
    end
The important point here is that we close any connections to external resources before forking, as the master has no use for them. Also note that we kill the old master as soon as the preloading is done.
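The old-master handling can also be pulled out into a small helper, which makes the edge cases easier to see. The following is only a sketch: `quit_old_master` is a name I made up, and the guard against a non-positive pid is my own addition, since `to_i` returns 0 for an empty file and signalling pid 0 would address the whole process group.

```ruby
# Hypothetical helper mirroring the old-master logic of the before_fork hook.
# Returns true if QUIT was sent, false if there was nothing to do.
def quit_old_master(old_pid_file, current_pid_file)
  return false if old_pid_file == current_pid_file

  pid = File.read(old_pid_file).to_i
  # An empty or garbled file yields 0; never signal pid 0 or below.
  return false unless pid.positive?

  Process.kill("QUIT", pid)
  true
rescue Errno::ENOENT, Errno::ESRCH
  false # file vanished or process already gone: someone else did our job
end

# Possible use inside before_fork (paths as in the configuration above):
#   quit_old_master(ENV["TEMP_PATH"] + "/pids/unicorn.pid.oldbin", server.pid)
```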
If we deploy using Mina, we can use the following configuration to perform a zero-downtime deploy:
    desc "Deploys the current version to the server."
    task :deploy => :environment do
      deploy do
        # omitted
        to :launch do
          queue %[kill -s USR2 $(sudo supervisorctl status | grep unicornherder | cut -d' ' -f7 | cut -d',' -f1)]
        end
      end
    end
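The `cut -d' ' -f7` in the launch step depends on how many spaces supervisorctl uses to pad its columns, so it can break when the program name changes length. As a sketch of a less position-dependent alternative, the pid can be extracted with a regular expression. The helper name `herder_pid` and the sample status lines in the comment are my own; I'm assuming the usual `name  STATE  pid N, uptime H:MM:SS` layout of `supervisorctl status`.

```ruby
# Hypothetical helper: find the pid of the first RUNNING program whose name
# matches `pattern` in the text printed by `supervisorctl status`.
# Assumed line layout:
#   myapp:myapp-unicornherder-1   RUNNING   pid 12345, uptime 0:01:23
def herder_pid(status_output, pattern = /unicornherder/)
  status_output.each_line do |line|
    name, state, rest = line.split(/\s+/, 3)
    next unless name.to_s.match?(pattern) && state == "RUNNING"

    match = rest.to_s.match(/\Apid (\d+),/)
    return Integer(match[1]) if match
  end
  nil # no matching running program found
end

# Possible use on the server:
#   pid = herder_pid(`sudo supervisorctl status`)
#   Process.kill("USR2", pid) if pid
```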
Starting and stopping unicorn is handled through supervisord:
    desc "stop the application"
    task :down do
      queue "sudo supervisorctl stop myapp:*"
    end

    desc "start the application"
    task :up do
      queue "sudo supervisorctl start myapp:*"
    end
Verify we got a zero-downtime deployment
Now it's time to verify our setup is actually working.
Running ab -c 2 -n 100 http://www.example.com/ while restarting our application should not result in ANY dropped connections. Note that this largely depends on how long your application needs to start up.
We could further amplify the effects by adding fake calls to sleep in our application.rb.
Anyway, here it goes:
    This is ApacheBench, Version 2.3 <$Revision: 655654 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking example.com (be patient).....done

    Server Software:        nginx/1.2.4
    Server Hostname:        example.com
    Server Port:            80

    Document Path:          /
    Document Length:        22527 bytes

    Concurrency Level:      2
    Time taken for tests:   10.947 seconds
    Complete requests:      100
    Failed requests:        0
    Write errors:           0
    Total transferred:      2319600 bytes
    HTML transferred:       2252700 bytes
    Requests per second:    9.13 [#/sec] (mean)
    Time per request:       218.949 [ms] (mean)
    Time per request:       109.475 [ms] (mean, across all concurrent requests)
    Transfer rate:          206.92 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:       55   58   1.8     57      69
    Processing:   137  160  27.4    145     263
    Waiting:       69   82  21.0     74     148
    Total:        193  218  27.4    204     320

    Percentage of the requests served within a certain time (ms)
      50%    204
      66%    215
      75%    242
      80%    249
      90%    265
      95%    271
      98%    274
      99%    320
     100%    320 (longest request)

    This is ApacheBench, Version 2.3 <$Revision: 655654 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking example.com (be patient).....done

    Server Software:        nginx/1.2.4
    Server Hostname:        example.com
    Server Port:            80

    Document Path:          /
    Document Length:        22527 bytes

    Concurrency Level:      2
    Time taken for tests:   10.584 seconds
    Complete requests:      100
    Failed requests:        0
    Write errors:           0
    Total transferred:      2319600 bytes
    HTML transferred:       2252700 bytes
    Requests per second:    9.45 [#/sec] (mean)
    Time per request:       211.686 [ms] (mean)
    Time per request:       105.843 [ms] (mean, across all concurrent requests)
    Transfer rate:          214.02 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:       55   58   1.5     58      65
    Processing:   137  153  18.3    145     207
    Waiting:       68   76   5.8     75     102
    Total:        195  211  18.4    202     265

    Percentage of the requests served within a certain time (ms)
      50%    202
      66%    204
      75%    215
      80%    219
      90%    248
      95%    251
      98%    252
      99%    265
     100%    265 (longest request)
No failed requests. It works! And the response times with multiple restarts are only slightly worse. Great!
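If you'd rather script this check than read ab output by hand, a small probe can do a similar job. This is a minimal sketch, not a replacement for ab: `count_failures` is a hypothetical helper, it sends requests one at a time (so it corresponds to concurrency 1, not 2), and I'm treating any 5xx status or connection error as a failure.

```ruby
require "net/http"
require "uri"

# Hypothetical helper, not part of ab or unicorn: issue `requests` GETs and
# count the ones that fail. `fetch` is injectable so the loop can be tried
# without a network; by default it returns the HTTP status as an Integer.
def count_failures(url, requests: 100,
                   fetch: ->(uri) { Net::HTTP.get_response(uri).code.to_i })
  uri = URI(url)
  failures = 0
  requests.times do
    begin
      status = fetch.call(uri)
      failures += 1 if status >= 500 # e.g. a 502 from the proxy in front of unicorn
    rescue StandardError
      failures += 1 # refused, reset or timed-out connections count as well
    end
  end
  failures
end

# Possible use while a deploy is running (URL from the ab example above):
#   puts count_failures("http://www.example.com/", requests: 100)
```

A result of 0 while a restart is in progress corresponds to the "Failed requests: 0" line in the ab output.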
I hope this blog post helped clarify how to use unicorn and supervisord together for zero-downtime deployments, so your app server keeps serving requests.
- unicorn requires unicornherder for zero-downtime deployments if you are using supervisord
- unicorn spawns a second master when sent SIGUSR2, which means you'll be running twice as many workers as you specified during restarts
That's it! Happy hacking!