Recovering from an interrupted rdiff-backup

I use rdiff-backup to script nightly backups for the servers I set up. It is a good tool that does its job well, except when a backup is interrupted midway. Across networks this happens occasionally, and when it does, rdiff-backup will not run normally again without administrator intervention.

Recovery steps in detail

I'm using rdiff-backup 1.1.5 and Python 2.4.4. The error I get looks like this:

Exception '' raised of class 'exceptions.AssertionError':
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 295, in error_check_Main
    try: Main(arglist)
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 315, in Main
    take_action(rps)
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 273, in take_action
    elif action == "check-destination-dir": CheckDest(rps[0])
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 774, in CheckDest
    need_check = checkdest_need_check(dest_rp)
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 810, in checkdest_need_check
    if not force: curmir_incs[0].conn.regress.check_pids(curmir_incs)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 448, in __call__
    return apply(self.connection.reval, (self.name,) + args)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 367, in reval
    for arg in args: self._put(arg, req_num)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 139, in _put
    else: self._putobj(obj, req_num)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 144, in _putobj
    self._write("o", pickle.dumps(obj, 1), req_num)
  File "pickle.py", line 1386, in dumps
    Pickler(file, protocol, bin).dump(obj)
  File "pickle.py", line 231, in dump
    self.save(obj)
  File "pickle.py", line 293, in save
    f(self, obj) # Call unbound method with explicit self
  File "pickle.py", line 614, in save_list
    self._batch_appends(iter(obj))
  File "pickle.py", line 647, in _batch_appends
    save(x)
  File "pickle.py", line 293, in save
    f(self, obj) # Call unbound method with explicit self
  File "pickle.py", line 737, in save_inst
    stuff = getstate()
  File "/var/lib/python-support/python2.4/rdiff_backup/rpath.py", line 754, in __getstate__
    assert self.conn is Globals.local_connection

Traceback (most recent call last):
  File "/usr/bin/rdiff-backup", line 23, in ?
    rdiff_backup.Main.error_check_Main(sys.argv[1:])
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 295, in error_check_Main
    try: Main(arglist)
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 315, in Main
    take_action(rps)
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 273, in take_action
    elif action == "check-destination-dir": CheckDest(rps[0])
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 774, in CheckDest
    need_check = checkdest_need_check(dest_rp)
  File "/var/lib/python-support/python2.4/rdiff_backup/Main.py", line 810, in checkdest_need_check
    if not force: curmir_incs[0].conn.regress.check_pids(curmir_incs)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 448, in __call__
    return apply(self.connection.reval, (self.name,) + args)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 367, in reval
    for arg in args: self._put(arg, req_num)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 139, in _put
    else: self._putobj(obj, req_num)
  File "/var/lib/python-support/python2.4/rdiff_backup/connection.py", line 144, in _putobj
    self._write("o", pickle.dumps(obj, 1), req_num)
  File "/usr/lib/python2.4/pickle.py", line 1386, in dumps
    Pickler(file, protocol, bin).dump(obj)
  File "/usr/lib/python2.4/pickle.py", line 231, in dump
    self.save(obj)
  File "/usr/lib/python2.4/pickle.py", line 293, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.4/pickle.py", line 614, in save_list
    self._batch_appends(iter(obj))
  File "/usr/lib/python2.4/pickle.py", line 647, in _batch_appends
    save(x)
  File "/usr/lib/python2.4/pickle.py", line 293, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.4/pickle.py", line 737, in save_inst
    stuff = getstate()
  File "/var/lib/python-support/python2.4/rdiff_backup/rpath.py", line 754, in __getstate__
    assert self.conn is Globals.local_connection
AssertionError
Fatal Error: Lost connection to the remote system

My first attempt at fixing this problem was to use the --check-destination-dir option, since the rdiff-backup man page claims that “running rdiff-backup with this option on the destination dir will undo the failed directory”.

# /usr/bin/rdiff-backup --check-destination-dir root@remote.backup.host::/path/to/backup

But this didn't work and produced a similar (possibly identical) error to the one above. Next I decided to try the “force” option, since I had read reports from others who had fixed rdiff-backup errors with it. According to the rdiff-backup man page, the force option will “Authorize a more drastic modification of a directory than usual”.

# /usr/bin/rdiff-backup --force --check-destination-dir root@remote.backup.host::/path/to/backup
# echo $?
0

OK, that completed successfully. For completeness, list the available backups:

# /usr/bin/rdiff-backup -l root@remote.backup.host::/path/to/backup

Found 239 increments:
    increments.2007-10-07T02:11:45-07:00.dir   Sun Oct  7 02:11:45 2007
    increments.2007-10-18T09:55:29-07:00.dir   Thu Oct 18 09:55:29 2007
    increments.2007-10-18T10:56:32-07:00.dir   Thu Oct 18 10:56:32 2007
.
.
.
    increments.2008-06-09T19:00:02-07:00.dir   Mon Jun  9 19:00:02 2008
    increments.2008-06-10T19:00:03-07:00.dir   Tue Jun 10 19:00:03 2008
    increments.2008-06-11T19:00:03-07:00.dir   Wed Jun 11 19:00:03 2008
Current mirror: Thu Jun 12 19:00:03 2008
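
If you want a script to confirm that the repository is readable again, a small check on the listing is one option. This is only a sketch: the function names current_mirror and increment_count are my own, and it assumes the listing looks exactly like the output shown here from rdiff-backup 1.1.5. Other versions may format it differently.

```shell
#!/bin/sh
# Hypothetical helpers; both read `rdiff-backup -l` output on stdin.
# They assume the listing format shown in this post (rdiff-backup 1.1.5).

# Print the timestamp from the "Current mirror:" line, if there is one.
current_mirror() {
    sed -n 's/^Current mirror: //p'
}

# Print the number from the "Found N increments:" line, if there is one.
increment_count() {
    sed -n 's/^Found \([0-9][0-9]*\) increments:$/\1/p'
}
```

It would be used like this: /usr/bin/rdiff-backup -l root@remote.backup.host::/path/to/backup | current_mirror. Empty output would suggest the listing did not complete normally.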

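To summarize: the error message was long and bewildering, but the only thing that had changed since the last successful backup was that a normal backup had been interrupted. The interruption left the backup location corrupted, and the “force” option was required to make it usable again. The traceback hints at why forcing helps: the assertion fires during the check_pids call in checkdest_need_check, and that call is guarded by “if not force”, so a forced check skips it.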
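
Since my backups run from a nightly script, the manual recovery could in principle be folded into that script. What follows is a rough sketch, not something I have run in production. The function name backup_with_recovery and the RDIFF_BACKUP variable are made up for illustration; the only rdiff-backup options used are the --force and --check-destination-dir flags shown above. It assumes that any failed backup is worth one forced check and one retry, which may be too aggressive for some setups.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual recovery steps from this post.
# RDIFF_BACKUP is an illustrative variable naming the binary to run.
RDIFF_BACKUP="${RDIFF_BACKUP:-rdiff-backup}"

# Usage: backup_with_recovery SOURCE DESTINATION
# Returns 0 if a backup succeeded, 2 if the forced check itself failed,
# otherwise the exit status of the retried backup.
backup_with_recovery() {
    src="$1"
    dest="$2"

    # Normal case: the backup just works.
    if "$RDIFF_BACKUP" "$src" "$dest"; then
        return 0
    fi

    # The backup failed. Assume the destination was left half-written by an
    # interrupted run and try the forced check that worked manually.
    echo "backup failed; running --force --check-destination-dir on $dest" >&2
    if ! "$RDIFF_BACKUP" --force --check-destination-dir "$dest"; then
        echo "forced check failed; manual intervention needed" >&2
        return 2
    fi

    # Retry the backup once after the forced check.
    "$RDIFF_BACKUP" "$src" "$dest"
}
```

A nightly job would then call something like backup_with_recovery /path/to/source root@remote.backup.host::/path/to/backup, where /path/to/source is a placeholder. Keep in mind that the force option deliberately overrides a safety check, so this only makes sense if you are sure no other rdiff-backup process is still working on the same destination.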

Warren Howard 2008/06/24 02:01

techblog/recoverying_from_an_interrupted_rdiff-backup.txt · Last modified: 2008/08/22 04:36 by warren