Making RabbitMQ Recover from (a)Mnesia

At the company I work for we’re using RabbitMQ to offload non-time-critical processing of tasks. To be able to recover in case RabbitMQ goes down, our queues are durable and all our messages are marked as persistent. We generally have a very low number of messages in flight at any moment in time. There’s just one queue with a decent number of them: the “failed messages” dump.

The Problem

It so happened that after a botched update to the most recent version of RabbitMQ (3.5.3 at the time) our admins had to nuke the server and install it from scratch. They had made a backup of RabbitMQ’s Mnesia database and I was tasked with recovering the messages from it.
This is the story of how I did it.

Since our RabbitMQ was configured to persist all the messages, this should generally be possible. Surely I wouldn’t be the first one to attempt this.

Searching the Internet, it seems there’s no way of exporting or importing a node’s configuration while it’s not running. I couldn’t find any documentation on how to import a Mnesia backup into a new node or extract data from it into a usable form.

The Idea

My idea was to set up a virtual machine (running Debian Wheezy) with RabbitMQ and then to somehow make it read, recover and run the broken server’s database.

Below I use the following placeholders:

  • RABBITMQ_MNESIA_DIR the directory holding RabbitMQ’s Mnesia database on Debian (see RabbitMQ’s file locations)
  • BROKEN_NODENAME the $RABBITMQ_NODENAME of the broken server we have backups from
  • BROKEN_HOST the hostname of said server

One more thing before we start: whenever I say “fix permissions” below, I mean:

sudo chown -R rabbitmq:rabbitmq $RABBITMQ_MNESIA_DIR

1st Try

My first try, simply copying the broken node’s Mnesia files to the VM’s $RABBITMQ_MNESIA_DIR, failed. The files contained node names that RabbitMQ tried to reach, but they were unreachable from the VM.

Error description:
            "Mnesia could not connect to any nodes."},

So I tried to be a little bit more picky on what I copied.

First I had to reset $RABBITMQ_MNESIA_DIR by deleting it and letting RabbitMQ recreate it. (I needed to do this way too many times.)

sudo service rabbitmq-server stop
sudo rm -r "$RABBITMQ_MNESIA_DIR"
sudo service rabbitmq-server start

After stopping RabbitMQ, I tried to feed it the broken server’s data piecemeal. This time I only copied the


  and restarted RabbitMQ.

RabbitMQ’s management interface lists all the queues, but it thinks the node they’re on is “down”

The web management interface showed all the queues we were missing, but they were “down”, and clicking on any of them only produced this message:

The object you clicked on was not found; it may have been deleted on the server.

Copying any more data didn’t solve the issue. So this was a dead end.

2nd Try

So I thought: why doesn’t the RabbitMQ in the VM pretend to be the exact same node as the one on the broken server?

So I created a




  in there.

I copied the backup to $RABBITMQ_MNESIA_DIR (now with the new node name) and fixed the permissions.

Now starting RabbitMQ failed with

ERROR: epmd error for host $BROKEN_HOST: nxdomain (non-existing domain)

I edited


  to add $BROKEN_HOST to the list of names that resolve to

Now restarting RabbitMQ failed with yet another error:

Error description:

Now what? Why don’t I try to give it the Mnesia files piece by piece again?

  • Stop RabbitMQ
  • Copy

      files in again and fix their permissions

  • Start RabbitMQ
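
The stop, copy, fix permissions, start loop above gets repetitive quickly. Here is a minimal sketch that only prints the commands for one round, so they can be reviewed before running them by hand. The helper name restore_commands, the example paths and file name, and the use of cp -a are assumptions made for illustration; the service and chown commands are the ones used earlier in this post.

```python
import shlex


def restore_commands(backup_dir, mnesia_dir, filenames):
    """Build the command lines for one stop/copy/fix-permissions/start round."""
    commands = [["sudo", "service", "rabbitmq-server", "stop"]]
    for filename in filenames:
        # Assumption: a plain `cp -a` is enough to bring a file over from the backup.
        commands.append(["sudo", "cp", "-a", f"{backup_dir}/{filename}", f"{mnesia_dir}/"])
    # "Fix permissions" as defined above.
    commands.append(["sudo", "chown", "-R", "rabbitmq:rabbitmq", mnesia_dir])
    commands.append(["sudo", "service", "rabbitmq-server", "start"])
    return commands


# Hypothetical paths and file name, purely for illustration.
commands = restore_commands("/path/to/backup", "/path/to/mnesia", ["example-file"])
for command in commands:
    print(shlex.join(command))
```

Nothing is executed here; the script just prints what would be run.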

All our queues were back and all their configuration seemed OK as well. But we still didn’t have our messages back yet.

The queues have been restored, but they have no messages in them


So I tried to copy more and more files over from the backup repeating the above steps. I finally reached my goal after copying








. After fixing their permissions and starting RabbitMQ, all the queues were restored with all the messages in them.

Queues and messages restored

Now I could use ordinary methods to extract all the messages. I dumped them all and, on examination, they looked OK. Publishing the recovered messages to the new server, I was pretty euphoric.

Bottle Plugin Lifecycle

If you use Python’s Bottle micro-framework, there’ll come a time when you’ll want to add custom plugins. To get a better feeling for what code gets executed when, I created a minimal Bottle app with a test plugin that logs each step. I used it to test both global and route-specific plugins.

When Python loads the module you’ll see that the plugins’ __init__() and setup() methods are called immediately when they are installed on the app (plugins that are only applied to a route get just their __init__() called at this point). This happens in the order they appear in the code. Then the app is started.

The first time a route is called Bottle executes the plugins’ apply() methods. This happens in “reversed order” of installation (which makes sense for a nested callback chain). This means first the route-specific plugins get applied, then the global ones. Their result is cached, i.e. only the inner/wrapped function is executed from here on out.

Then for every request the apply() method’s inner function is executed. This happens in the “original” order again.
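
Why the order flips can be seen without Bottle at all. The following is a minimal sketch in plain Python that only imitates the nesting; make_plugin and build_handler are hypothetical helpers and not part of Bottle’s API.

```python
def make_plugin(name, log):
    """Return an apply()-style function that wraps a callback and logs each step."""
    def apply(callback):
        log.append("%s: apply" % name)

        def wrapper(*args, **kwargs):
            log.append("%s: wrapper" % name)
            return callback(*args, **kwargs)
        return wrapper
    return apply


def build_handler(callback, plugins):
    # Wrap in reversed installation order: the last plugin wraps the route
    # callback first, so the first plugin ends up as the outermost layer.
    for apply in reversed(plugins):
        callback = apply(callback)
    return callback


log = []
names = ["app_plugin1", "app_plugin2", "route_plugin1", "route_plugin2"]
handler = build_handler(lambda: "pong", [make_plugin(name, log) for name in names])
apply_order = list(log)

del log[:]
result = handler()
call_order = list(log)

print(apply_order)
print(call_order)
```

Because the outermost wrapper runs first, applying the plugins in reverse is exactly what makes them execute in their original order on each request.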

Below you can see the code and example logs for two requests. You can also clone the Gist and do your own experiments.

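#!/usr/bin/env python
# -*- coding: utf-8 -*-
import bottle
import logging
# Assumption: a handler is configured so the messages below actually get printed.
logging.basicConfig(format="%(message)s")
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.info("module load")

class LifecycleTestPlugin(object):
    name = "lifecycle_test"
    api = 2

    def __init__(self, name):
        self.name = name
        logger.info("%s: plugin __init__", self.name)
    def setup(self, app):
        logger.info("%s: plugin setup", self.name)
    def apply(self, callback, context):
        logger.info("%s: plugin apply", self.name)
        def _wrapper(*args, **kwargs):
            logger.info("%s: plugin apply wrapper", self.name)
            return callback(*args, **kwargs)
        return _wrapper
    def close(self):
        logger.info("plugin close %s", self.name)

app = bottle.Bottle()
logger.info("installing plugins ...")
# Reconstructed from the log output below: two global and two route-specific plugins.
app.install(LifecycleTestPlugin("app_plugin1"))
app.install(LifecycleTestPlugin("app_plugin2"))
@app.route("/ping", apply=[LifecycleTestPlugin("route_plugin1"), LifecycleTestPlugin("route_plugin2")])
def ping():
    return 'pong'

if __name__ == "__main__":
    logger.info("start app ...")
    bottle.run(app, host="", port="9000", reloader=True)
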
module load
installing plugins ...
app_plugin1: plugin __init__
app_plugin1: plugin setup
app_plugin2: plugin __init__
app_plugin2: plugin setup
route_plugin1: plugin __init__
route_plugin2: plugin __init__
start app ...
module load
installing plugins ...
app_plugin1: plugin __init__
app_plugin1: plugin setup
app_plugin2: plugin __init__
app_plugin2: plugin setup
route_plugin1: plugin __init__
route_plugin2: plugin __init__
start app ...
Bottle v0.12.8 server starting up (using WSGIRefServer())...
Listening on
Hit Ctrl-C to quit.
route_plugin2: plugin apply
route_plugin1: plugin apply
app_plugin2: plugin apply
app_plugin1: plugin apply
app_plugin1: plugin apply wrapper
app_plugin2: plugin apply wrapper
route_plugin1: plugin apply wrapper
route_plugin2: plugin apply wrapper
- - [05/Jul/2015 14:07:25] "GET /ping HTTP/1.1" 200 4
app_plugin1: plugin apply wrapper
app_plugin2: plugin apply wrapper
route_plugin1: plugin apply wrapper
route_plugin2: plugin apply wrapper
- - [05/Jul/2015 14:07:28] "GET /ping HTTP/1.1" 200 4

Android Backup and Restore with ADB

Updating my OnePlus One recently to Cyanogen OS 12 I had to reset my phone a few times before everything ran smoothly … so I wrote a pair of scripts to help me copy things around.

They use the Android SDK’s ADB tool to do the copying, since the quality of the Android File Transfer tool for Mac is laughable by Google’s standards.

Update 2018-11-22:
Since the scripts became more sophisticated I moved them to a proper project on GitHub.

Synchronize directories between computers using rsync (and SSH)

I found this command line magic gem some time ago and have been using it ever since.

I started using it for synchronizing directories between computers on the same network. But it felt kind of clunky and cumbersome to get the slashes right so that it wouldn’t nest those directories and copy everything. Since both source and destination machine had the same basic directory layout, I thought ‘why not make it easier?’ … e.g. like this:

sync-to other-pc ~/Documents
sync-to other-pc ~/Music --exclude '*.wav'
sync-from other-pc ~/Music --dry-run --delete

It uses rsync for the heavy lifting but does the tedious source and destination mangling for you. 😀
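
Getting the slashes right boils down to one rule: a trailing slash on the source makes rsync copy the directory’s contents instead of nesting the directory itself inside the destination. Here is a minimal sketch of that source and destination mangling, assuming a hypothetical helper called build_src_dest (the names are made up for illustration and are not necessarily how the script is organised):

```python
import re


def build_src_dest(program_name, host, directory):
    """Return [source, destination] for rsync, depending on the name we were called as."""
    if re.search(r"from$", program_name):
        # Pull: remote directory contents into the same local directory.
        return [f"{host}:{directory}/", directory]
    if re.search(r"to$", program_name):
        # Push: local directory contents into the same remote directory.
        return [f"{directory}/", f"{host}:{directory}"]
    raise ValueError("unknown command: %s" % program_name)


push = build_src_dest("sync-to", "other-pc", "/home/me/Music")
pull = build_src_dest("sync-from", "other-pc", "/home/me/Music")
print(push)
print(pull)
```

Since the same path is used on both sides, this only works when both machines share the same basic directory layout.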

You can find the code in this Gist.

#!/usr/bin/env python3
# Author: Riyad Preukschas <>
# License: Mozilla Public License 2.0
#
# Synchronize directories between computers using rsync (and SSH).
#
# Save this script as something like `sync-to` somewhere in $PATH.
# Link it to `sync-from` in the same location. (i.e. `ln sync-to sync-from`)

import os
import re
import shlex
import subprocess
import sys

PROGRAM_NAME = os.path.basename(sys.argv[0])
RSYNC = 'rsync'
RSYNC_BASIC_OPTIONS = ['--rsh="ssh"', '--partial', '--progress', '--archive', '--human-readable']
RSYNC_EXCLUDE_PATTERNS = ['.DS_Store', '.localized']


# helpers

def print_usage_and_die():
    # strip the 8 spaces of source indentation from every line of the help text
    print(re.sub(r'^[ ]{8}', '', f"""
        Synchronize directories between computers using rsync (and SSH).

        Usage: {PROGRAM_NAME} HOST DIR [options]
            HOST any host you'd use with SSH
            DIR must be available on both the local and the remote machine

        You can pass any options rsync accepts.
            -v, --verbose will also print the command that'll be used to sync

            sync-to other-pc ~/Documents
            sync-to other-pc ~/Music --exclude '*.wav'
            sync-from other-pc ~/Music --dry-run --delete
        """, flags=re.MULTILINE))
    sys.exit(1)


def is_verbose():
    # the alternatives are grouped so that both are anchored at start and end;
    # otherwise a host like `my-server` would count as a `-v` flag
    return any(
        re.search(r'^(--verbose|-\w*v\w*)$', arg) is not None
        for arg
        in sys.argv
    )


# main

def main():
    # parse options
    if len(sys.argv) < 3:
        print_usage_and_die()

    host = sys.argv[1]
    dir = sys.argv[2]
    # no shell is involved, so quotes around the pattern would reach rsync literally
    rsync_excludes = [f"--exclude={pattern}" for pattern in RSYNC_EXCLUDE_PATTERNS]
    rsync_user_options = sys.argv[3:]

    if re.search(r'from$', PROGRAM_NAME):
        rsync_src_dest = [f"{host}:{dir}/", dir]
    elif re.search(r'to$', PROGRAM_NAME):
        rsync_src_dest = [f"{dir}/", f"{host}:{dir}"]
    else:
        print('Error: unknown command')
        print_usage_and_die()

    # copy
    exec_args = RSYNC_BASIC_OPTIONS + rsync_excludes + rsync_user_options + rsync_src_dest
    if is_verbose():
        print(shlex.join([RSYNC] + exec_args))
    subprocess.run([RSYNC] + exec_args)


if __name__ == '__main__':
    main()


If you write software in Python you come to a point where you are testing a piece of code that expects a more or less elaborate dictionary as an argument to a function. As good software developers we want that code properly tested, but we want to use minimal fixtures to accomplish that.

So, I was looking for something that behaves like a dictionary, that you can give explicit return values for specific keys and that will give you some sort of “default” return value when you try to access an “unknown” item (I don’t care what, as long as there is no Exception raised (e.g. KeyError)).



My first thought was “why not use MagicMock?” … it’s a useful tool in so many situations.

from mock import MagicMock
m = MagicMock(foo="bar")

But using MagicMock where dict is expected yields unexpected results.

>>> # this works as expected
>>> m.foo
'bar'
>>> # but this doesn't do what you'd expect
>>> m["foo"]
<MagicMock name='mock.__getitem__()' id='4396280016'>

First of all, attribute and item access are treated differently. You set up MagicMock using keyword arguments (i.e. “dict syntax”), but have to use attributes (i.e. “object syntax”) to access them.
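
To make the difference concrete, here is a small sketch. It assumes Python 3, where the same class ships in the standard library as unittest.mock (the snippets in this post import it from the standalone mock package instead).

```python
from unittest.mock import MagicMock

m = MagicMock(foo="bar")

# Keyword arguments end up as attributes ...
attribute_value = m.foo

# ... while item access goes through the mocked __getitem__, which returns
# the same auto-created MagicMock no matter which key is asked for.
item_value = m["foo"]
same_for_any_key = m["foo"] is m["something else"]

print(attribute_value)
print(isinstance(item_value, MagicMock))
print(same_for_any_key)
```

So item access never raises, but it also never returns the value that was configured.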

Then I thought to myself “why not mess with the magic methods?” __getitem__ and __getattr__ expect the same arguments anyway. So this should work:

m = MagicMock(foo="bar")
m.__getitem__.side_effect = m.__getattr__

Well? …

>>> m["foo"]
<MagicMock name='' id='4554363920'>

… No!

By this time I thought “I can’t be the first to need this” and started searching in the docs and sure enough they provide an example for this case.

d = dict(foo="bar")

m = MagicMock()
m.__getitem__.side_effect = d.__getitem__

Does it work? …

>>> m["foo"]
'bar'
>>> m["bar"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../env/lib/python2.7/site-packages/", line 955, in __call__
    return _mock_self._mock_call(*args, **kwargs)
  File ".../env/lib/python2.7/site-packages/", line 1018, in _mock_call
    ret_val = effect(*args, **kwargs)
KeyError: 'bar'

Well, yes and no. It works as long as you only access those items that you have defined to be in the dictionary. If you try to access any “unknown” item you get a KeyError.
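
The same behaviour as a self-contained script, again assuming Python 3’s unittest.mock in place of the standalone mock package:

```python
from unittest.mock import MagicMock

d = dict(foo="bar")

m = MagicMock()
# Every item lookup on the mock is delegated to the real dictionary.
m.__getitem__.side_effect = d.__getitem__

known_value = m["foo"]

try:
    m["bar"]
    unknown_key_raised = False
except KeyError:
    # The dictionary has no "bar", so its KeyError passes straight through the mock.
    unknown_key_raised = True

print(known_value)
print(unknown_key_raised)
```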



After trying out different things the simplest answer to accomplish what I set out to do seems to be sub-classing defaultdict.

from collections import defaultdict

class MagicDict(defaultdict):
    def __missing__(self, key):
        result = self[key] = MagicDict()
        return result

And? …

>>> m = MagicDict(foo="bar")
>>> m["foo"]
'bar'
>>> m["bar"]
defaultdict(None, {})
>>> m.foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MagicDict' object has no attribute 'foo'

Indeed, it is. 😀
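
A side effect of returning another MagicDict for every unknown key is that arbitrarily deep lookups work as well. Here is a short sketch that repeats the class from above so it runs on its own; the key names are made up for illustration.

```python
from collections import defaultdict


class MagicDict(defaultdict):
    def __missing__(self, key):
        # Store and return a fresh MagicDict for every unknown key.
        result = self[key] = MagicDict()
        return result


m = MagicDict(foo="bar")

explicit = m["foo"]                   # a value we set up ourselves
nested = m["settings"]["db"]["host"]  # none of these keys were ever defined

print(explicit)
print(isinstance(nested, MagicDict))
print("settings" in m)
```

Note that every lookup of an unknown key also stores the new MagicDict under that key, so merely reading from it changes its contents.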

Well, not quite. There are still a few comfort features missing (e.g. a proper


). The whole, improved and tested code can be found in this Gist:

# -*- coding: utf-8 -*-
# Author: Riyad Preukschas <>
# License: Mozilla Public License 2.0
from collections import defaultdict


class MagicDict(defaultdict):
    def __init__(self, _name=None, **kwargs):
        super(MagicDict, self).__init__(**kwargs)
        self._name = _name

    def __missing__(self, key):
        # name the child after the path used to reach it, e.g. mock["a"]["b"]
        name = "%s[\"%s\"]" % (
            (self._name if self._name is not None else "mock"),
            key,
        )
        result = self[key] = MagicDict(_name=name)
        return result

    def __eq__(self, other):
        return self is other

    def __ne__(self, other):
        return not self == other

    def __repr__(self):
        """Overridden to mimic the output of mock.MagicMock"""
        if self._name is not None:
            name_string = " name='%s'" % self._name
        else:
            name_string = ""
        # the arguments below are reconstructed to fit the format string
        return "<%s%s id='%s'>" % (
            type(self).__name__,
            name_string,
            id(self),
        )