This repository was archived by the owner on Jun 15, 2025. It is now read-only.

Commit 5b8b1df

Introduce remote command execution support
Arguably, in a fair share of cases one would want to back up data not to a local device but to a remote host. One way to access a remote file system is to mount it locally, for example over NFS. Once mounted over NFS, however, certain file system properties are abstracted away: the particularities of a btrfs file system, for instance, are not exposed through NFS (or similar approaches). This constraint renders any such mounting-based approach effectively unusable for btrfs-backup.

In order to still be able to back up data remotely, this change introduces support for running commands remotely instead of only locally. Since btrfs-backup is basically a sophisticated command execution engine around the btrfs tool suite, we can execute the commands not only locally but also remotely, using SSH, for instance. This of course implies that the remote side has a btrfs file system available along with the btrfs programs.

With this change it is now possible to specify a remote command by means of the newly introduced --remote-cmd option. The command is used as a prefix for the btrfs commands executed for the remote repository (in this diff, the destination repository), which means that the remote command must accept the command to execute remotely as its arguments. Note that although the approach was designed with SSH in mind, really any command that allows for connecting to a remote host and running a command there should work.
1 parent 0aff36c commit 5b8b1df

6 files changed: 146 additions and 32 deletions
btrfs-backup/README.md

Lines changed: 25 additions & 0 deletions
@@ -93,3 +93,28 @@ achieved by means of the snapshots-only option, like so:
 $ btrfs-backup restore --snapshots-only
 --subvolume=subvolume/ backup/ snapshots/
 ```
+
+### Remote Execution
+In many cases it is a requirement to back up a subvolume to a remote host
+(or restore a subvolume from it). Mounting a remote btrfs file system
+locally by means of, for instance, sshfs will not provide the ability to
+use btrfs-specific tools on it.
+To that end, commands can be run on the remote host directly (provided
+it offers an interface for command execution from the outside and that
+it has the required btrfs tool suite installed). A typical example for
+remote command execution is SSH. Using **btrfs-backup** on a remote host
+by means of an SSH connection can be achieved with the --remote-cmd
+option:
+
+```
+$ btrfs-backup backup --remote-cmd='/usr/bin/ssh server'
+--subvolume=subvolume/ snapshots/ backup/
+```
+
+In this example, we assume that by invoking '/usr/bin/ssh server'
+locally we can establish a connection to the remote server. The command
+specified using the --remote-cmd option has to be given with the full
+path. Furthermore, this command must accept the command to execute
+remotely (that is, on the host named 'server' in our example above) as
+its arguments. Note that backup/ in this case does not refer to a local
+folder but rather one on the remote side.
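
The requirement that the remote command "accept the command to execute remotely as its arguments" can be tried out without an SSH server. The sketch below is only an illustration and not part of the commit: it uses `/usr/bin/env` as a local stand-in for `/usr/bin/ssh server`, because `env` likewise runs whatever command it is handed as arguments. The `echo` invocation is a hypothetical placeholder for a btrfs command.

```python
import subprocess

# Stand-in for a remote command such as "/usr/bin/ssh server": env(1) also
# runs the command it receives as its arguments, just on the local machine.
# The split() mirrors how the commit turns the --remote-cmd string into words.
remote_cmd = "/usr/bin/env".split()

# Hypothetical command to run "remotely"; btrfs-backup would place a btrfs
# invocation here instead.
command = ["echo", "hello from the other side"]

# The remote command words simply precede the command to execute.
result = subprocess.run(remote_cmd + command, stdout=subprocess.PIPE, check=True)
output = result.stdout.decode("utf-8")
print(output, end="")
```

Any program that honors this calling convention should be usable in the same position, which is why the option is not tied to SSH.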

btrfs-backup/src/deso/btrfs/main.py

Lines changed: 17 additions & 3 deletions
@@ -54,7 +54,7 @@ def version():
   return "0.1"


-def run(method, subvolumes, src_repo, dst_repo, **kwargs):
+def run(method, subvolumes, src_repo, dst_repo, remote_cmd=None, **kwargs):
   """Start actual execution."""
   try:
     # This import pulls in all required modules and we check for
@@ -65,8 +65,14 @@ def run(method, subvolumes, src_repo, dst_repo, **kwargs):
     print("A command was not found:\n\"%s\"" % e, file=stderr)
     return 1

+  if remote_cmd:
+    # TODO: Right now we do not support remote commands that contain
+    #       spaces in their path. E.g., "/bin/connect to server" would
+    #       not be a valid command.
+    remote_cmd = remote_cmd.split()
+
   try:
-    program = Program(subvolumes, src_repo, dst_repo)
+    program = Program(subvolumes, src_repo, dst_repo, remote_cmd)
     method(program)(**kwargs)
     return 0
   except ChildProcessError as e:
@@ -112,7 +118,13 @@ def addStandardArgs(parser):


 def addOptionalArgs(parser):
-  """Add the optional --reverse argument to a parser."""
+  """Add the optional arguments to a parser."""
+  parser.add_argument(
+    "--remote-cmd", action="store", dest="remote_cmd", metavar="command",
+    help="The command to use for running btrfs on a remote site. Needs "
+         "to include the full path to the binary or script, e.g., "
+         "\"/usr/bin/ssh server\".",
+  )
   parser.add_argument(
     "--reverse", action="store_true", dest="reverse", default=False,
     help="Reverse (i.e., swap) the source and destination repositories.",
@@ -235,9 +247,11 @@ def main(argv):

   if ns.command == "backup":
     return run(lambda x: x.backup, subvolumes, src_repo, dst_repo,
+               remote_cmd=ns.remote_cmd,
                keep_for=ns.keep_for)
   elif ns.command == "restore":
     return run(lambda x: x.restore, subvolumes, src_repo, dst_repo,
+               remote_cmd=ns.remote_cmd,
                snapshots_only=ns.snapshots_only)
   else:
     assert False
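
The TODO in `run()` notes that a plain `str.split()` cannot handle a remote command whose path contains spaces. One possible way to lift that restriction, shown here purely as a sketch and not as what the commit does, is to split the option value with the standard library's `shlex.split`, which honors shell-style quoting. The helper name `split_remote_cmd` and the `--fast` flag are hypothetical.

```python
import shlex


def split_remote_cmd(remote_cmd):
  """Split a remote command string into words, honoring shell-style quotes."""
  return shlex.split(remote_cmd)


# The common case behaves exactly like str.split().
plain = split_remote_cmd("/usr/bin/ssh server")

# A quoted path containing spaces stays a single word.
quoted = split_remote_cmd('"/bin/connect to server" --fast')

# For contrast: the whitespace split used in the commit tears the path apart.
naive = '"/bin/connect to server" --fast'.split()

print(plain)
print(quoted)
print(naive)
```

With this approach the user would have to quote such a path inside the option value, for example `--remote-cmd='"/bin/connect to server" --fast'`.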

btrfs-backup/src/deso/btrfs/program.py

Lines changed: 2 additions & 2 deletions
@@ -28,11 +28,11 @@

 class Program:
   """A program object performs the actual work of synchronizing two repositories."""
-  def __init__(self, subvolumes, src_repo, dst_repo):
+  def __init__(self, subvolumes, src_repo, dst_repo, remote_cmd=None):
     """Create a new Program object using the given subvolumes and repositories."""
     self._subvolumes = subvolumes
     self._src_repo = Repository(src_repo)
-    self._dst_repo = Repository(dst_repo)
+    self._dst_repo = Repository(dst_repo, remote_cmd)


   def backup(self, keep_for=None):

btrfs-backup/src/deso/btrfs/repository.py

Lines changed: 33 additions & 20 deletions
@@ -125,8 +125,8 @@ def _parseDiffLine(line):
   return path


-def _snapshots(directory):
-  """Retrieve a list of snapshots in a directory.
+def _snapshots(repository):
+  """Retrieve a list of snapshots in a repository.

   Note:
     Because of a supposed bug in btrfs' handling of passed in
@@ -136,7 +136,8 @@ def _snapshots(directory):
     usage of this function is discouraged. Use the Repository's
     snapshots() method instead.
   """
-  output, _ = execute(*listSnapshots(directory), read_out=True)
+  cmd = repository.command(listSnapshots, repository.path())
+  output, _ = execute(*cmd, read_out=True)
   # We might retrieve an empty output if no snapshots were present. In
   # this case, just return early here.
   if not output:
@@ -147,9 +148,10 @@ def _snapshots(directory):
   return [_parseListLine(line) for line in output]


-def _isRoot(directory):
+def _isRoot(directory, repository):
   """Check if a given directory represents the root of a btrfs file system."""
-  output, _ = execute(*show(directory), read_out=True)
+  cmd = repository.command(show, directory)
+  output, _ = execute(*cmd, read_out=True)
   output = output.decode("utf-8")[:-1].split("\n")

   # The output of show() contains multiple lines in case the given
@@ -160,7 +162,7 @@ def _isRoot(directory):
   return len(output) == 1 and output[0].endswith(_SHOW_IS_ROOT)


-def _findRoot(directory):
+def _findRoot(directory, repository):
   """Find the root of the btrfs file system containing the given directory."""
   assert directory
   assert directory == abspath(directory)
@@ -169,7 +171,7 @@ def _findRoot(directory):
   # or later because of a dirname invocation. However, the show command
   # in _isRoot will fail for an empty directory (a case that will also
   # be hit if this function is run on a non-btrfs file system).
-  while not _isRoot(directory):
+  while not _isRoot(directory, repository):
     new_directory = dirname(directory)

     # Executing a dirname on the root directory ('/') just returns the
@@ -302,8 +304,9 @@ def _findCommonSnapshots(src_snaps, dst_snaps):
 def _createSnapshot(subvolume, repository, snapshots):
   """Create a snapshot of the given subvolume in the given repository."""
   name = _snapshotName(subvolume, snapshots)
+  cmd = repository.command(mkSnapshot, subvolume, repository.path(name))

-  execute(*mkSnapshot(subvolume, repository.path(name)))
+  execute(*cmd)
   return name


@@ -314,7 +317,7 @@ def _findOrCreate(subvolume, repository, snapshots):
   # If we found no snapshot or if files are changed between the current
   # state of the subvolume and the most recent snapshot we just found
   # then create a new snapshot.
-  if not snapshot or _diff(snapshot, subvolume):
+  if not snapshot or _diff(snapshot, subvolume, repository):
     old = snapshot["path"] if snapshot else None
     new = _createSnapshot(subvolume, repository, snapshots)
     return new, old
@@ -355,12 +358,12 @@ def _deploy(snapshot, parent, src, dst, src_snaps, subvolume):

   # Be sure to have the snapshot persisted to disk before trying to
   # serialize it.
-  execute(*syncFs(src.root))
+  execute(*src.command(syncFs, src.root))
   # Finally transfer the snapshot from the source repository to the
   # destination.
   pipeline([
-    serialize(src.path(snapshot), parents),
-    deserialize(dst.path())
+    src.command(serialize, src.path(snapshot), parents),
+    dst.command(deserialize, dst.path()),
   ])


@@ -440,7 +443,8 @@ def _restore(subvolume, src, dst, snapshots, snapshots_only):
   # Now that we got the snapshot back on the destination repository,
   # we can restore the actual subvolume from it (if desired).
   if not snapshots_only:
-    execute(*mkSnapshot(dst.path(snapshot), subvolume, writable=True))
+    cmd = dst.command(mkSnapshot, dst.path(snapshot), subvolume, writable=True)
+    execute(*cmd)


 def restore(subvolumes, src, dst, snapshots_only=False):
@@ -455,7 +459,7 @@ def restore(subvolumes, src, dst, snapshots_only=False):
     _restore(subvolume, src, dst, snapshots, snapshots_only)


-def _diff(snapshot, subvolume):
+def _diff(snapshot, subvolume, repository):
   """Find the files that changed in a given subvolume with respect to a snapshot."""
   # Because of an apparent bug in btrfs(8) (or a misunderstanding on my
   # side), we cannot use the generation reported for a snapshot to
@@ -469,7 +473,8 @@ def _diff(snapshot, subvolume):
   # to clarify whether a new snapshot *always* also means a new
   # generation (I assume so, but it would be best to get
   # confirmation).
-  output, _ = execute(*diff(subvolume, generation), read_out=True)
+  cmd = repository.command(diff, subvolume, generation)
+  output, _ = execute(*cmd, read_out=True)
   output = output.decode("utf-8")[:-1].split("\n")
   # The diff output usually is ended by a line such as:
   # "transid marker was" followed by a generation ID. We should ignore
@@ -500,7 +505,8 @@ def _purge(subvolume, repository, duration, snapshots):
       # old enough so that the snapshot should be deleted.
       time = datetime.strptime(string, _TIME_FORMAT)
       if time + duration < now:
-        execute(*delete(repository.path(snapshot)))
+        cmd = repository.command(delete, repository.path(snapshot))
+        execute(*cmd)


 def _trail(path):
@@ -510,18 +516,19 @@ def _trail(path):

 class Repository:
   """This class represents a repository for snapshots."""
-  def __init__(self, directory):
+  def __init__(self, directory, remote_cmd=None):
     """Initialize the object and bind it to the given directory."""
     # We always work with absolute paths here.
     directory = abspath(directory)

-    self._root = _findRoot(directory)
+    self._remote_cmd = remote_cmd
+    self._root = _findRoot(directory, self)
     self._directory = _trail(directory)


   def snapshots(self):
     """Retrieve a list of snapshots in this repository."""
-    snapshots = _snapshots(self._directory)
+    snapshots = _snapshots(self)

     # We need to work around the btrfs problem that not necessarily all
     # snapshots listed are located in our repository's directory.
@@ -584,14 +591,20 @@ def diff(self, snapshot, subvolume):
     if not found:
       raise FileNotFoundError("Snapshot not found: \"%s\"" % snapshot)

-    return _diff(found, subvolume, self)
+    return _diff(found, subvolume, self)


   def path(self, *components):
     """Form an absolute path by combining the given path components."""
     return join(self._directory, *components)


+  def command(self, function, *args, **kwargs):
+    """Create a command."""
+    command = function(*args, **kwargs)
+    return (self._remote_cmd if self._remote_cmd else []) + command
+
+
   @property
   def root(self):
     """Retrieve the root directory of the btrfs file system the repository resides on."""
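
To make the effect of `Repository.command()` on the transfer pipeline concrete, here is a minimal sketch under stated assumptions. `RepositorySketch` is a simplified stand-in that models only the prefixing logic, and `serialize`/`deserialize` are hypothetical stand-ins for the command builders of the same name (their real definitions are not part of this diff); the paths are made up. Because only the destination repository receives the remote command, only the receiving stage ends up prefixed.

```python
class RepositorySketch:
  """Simplified stand-in for Repository; only command prefixing is modeled."""

  def __init__(self, directory, remote_cmd=None):
    self._directory = directory
    self._remote_cmd = remote_cmd

  def command(self, function, *args, **kwargs):
    """Build a command and prefix it with the remote command, if one is set."""
    return (self._remote_cmd or []) + function(*args, **kwargs)


def serialize(snapshot):
  """Hypothetical builder for the sending side of the transfer."""
  return ["btrfs", "send", snapshot]


def deserialize(directory):
  """Hypothetical builder for the receiving side of the transfer."""
  return ["btrfs", "receive", directory]


# The source repository is local, the destination is reached through SSH.
src = RepositorySketch("/snapshots")
dst = RepositorySketch("/backup", remote_cmd=["/usr/bin/ssh", "server"])

# The two stages that would be handed to pipeline(): a local send whose
# output is piped into a receive that runs on the remote host.
stages = [
  src.command(serialize, "/snapshots/subvol-snapshot"),
  dst.command(deserialize, "/backup"),
]

for stage in stages:
  print(" ".join(stage))
```

Routing every command through the repository object keeps the call sites unaware of whether a repository is local or remote.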

btrfs-backup/src/deso/btrfs/test/testMain.py

Lines changed: 61 additions & 0 deletions
@@ -44,6 +44,7 @@
 )
 from deso.execute import (
   execute,
+  pipeline,
 )
 from os.path import (
   join,
@@ -174,6 +175,66 @@ def testKeepFor(self):
       self.assertEqual(len(glob(m.path("snapshots", "*"))), 1)


+  def testRemoteCommand(self):
+    """Verify that the remote command correctly prefixes some btrfs commands."""
+    remote_cmd = "/usr/bin/ssh server"
+
+    def isRemote(command):
+      """Check if a command is a remote command."""
+      return " ".join(command).startswith(remote_cmd)
+
+    def filterCommand(command):
+      """Filter all remote command parts from a command (if any)."""
+      # Since we do not have an SSH server running (and do not want to),
+      # filter out the remote command part before execution.
+      if isRemote(command):
+        command = command[2:]
+
+      return command
+
+    def filterCommands(commands):
+      """Filter all remote command parts from a command list (if any)."""
+      return [filterCommand(cmd) for cmd in commands]
+
+    def executeWrapper(*args, **kwargs):
+      """Wrapper around the execute function that stores the commands it executed."""
+      command = list(args)
+      all_commands.append(command)
+      command = filterCommand(command)
+
+      return execute(*command, **kwargs)
+
+    def pipelineWrapper(commands, *args, **kwargs):
+      """Wrapper around the pipeline function that stores the commands it executed."""
+      for command in commands:
+        all_commands.append(command)
+
+      commands = filterCommands(commands)
+      return pipeline(commands, *args, **kwargs)
+
+    all_commands = []
+    with patch("deso.btrfs.repository.execute", wraps=executeWrapper),\
+         patch("deso.btrfs.repository.pipeline", wraps=pipelineWrapper):
+      with alias(self._mount) as m:
+        make(m, "subvol", subvol=True)
+        make(m, "snapshots")
+        make(m, "backup")
+
+        remote = "--remote-cmd=%s" % remote_cmd
+        args = "backup --subvolume {subvol} {src} {dst}"
+        args = args.format(subvol=m.path("subvol"),
+                           src=m.path("snapshots"),
+                           dst=m.path("backup"))
+        result = btrfsMain([argv[0]] + args.split() + [remote])
+        self.assertEqual(result, 0)
+
+        # We do not check all commands here. However, we know for sure that
+        # btrfs receive should be run on the remote side. So check that it
+        # is properly prefixed.
+        receive, = list(filter(lambda x: "receive" in x, all_commands))
+        self.assertTrue(isRemote(receive), receive)
+
+
   def testRun(self):
     """Test a simple run of the program with two subvolumes."""
     def wipeSubvolumes(path, pattern="*"):
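
The test above relies on patching with `wraps=` so that every command is recorded and then executed with the remote prefix stripped. The following is a reduced, self-contained sketch of that technique and not the project's test: the `repository` namespace, its fake `execute`, and `run_backup` are hypothetical stand-ins, and `patch.object` is used instead of `patch` on a dotted module path so the example needs no installed package.

```python
import types
from unittest.mock import patch

# Hypothetical module-like namespace standing in for deso.btrfs.repository.
# Its fake execute() just joins the words instead of spawning a process.
repository = types.SimpleNamespace(execute=lambda *args: " ".join(args))


def run_backup():
  """Code under test; it looks up execute on the namespace at call time."""
  return repository.execute("/usr/bin/ssh", "server", "btrfs", "receive", "/backup")


recorded = []
original_execute = repository.execute


def execute_wrapper(*args):
  """Record the full command, then run it without the two-word remote prefix."""
  recorded.append(list(args))
  return original_execute(*args[2:])


# wraps= makes the mock forward each call to execute_wrapper and hand its
# return value back to the caller.
with patch.object(repository, "execute", wraps=execute_wrapper):
  result = run_backup()

print(recorded)
print(result)
```

Capturing `original_execute` before patching matters: inside the `with` block the namespace attribute is the mock, so calling it from the wrapper would recurse.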

btrfs-backup/src/deso/btrfs/test/testRepository.py

Lines changed: 8 additions & 7 deletions
@@ -38,7 +38,6 @@
   Repository,
   restore,
   _findCommonSnapshots,
-  _findRoot,
   _snapshots,
   sync as syncRepos,
 )
@@ -92,21 +91,22 @@ def testRepositoryListNoSnapshotPresent(self):
   def testRepositorySubvolumeFindRootOnRoot(self):
     """Test retrieval of the absolute path of the btrfs root from the root."""
     with alias(self._mount) as m:
-      self.assertEqual(_findRoot(m.path()), m.path())
+      repo = Repository(m.path())
+      self.assertEqual(repo.root, m.path())


   def testRepositorySubvolumeFindRootFromBelowRoot(self):
     """Test retrieval of the absolute path of the btrfs root from a sub-directory."""
     with alias(self._mount) as m:
-      directory = make(m, "dir")
-      self.assertEqual(_findRoot(directory), m.path())
+      repo = Repository(make(m, "dir"))
+      self.assertEqual(repo.root, m.path())


   def testRepositorySubvolumeFindRootFromSubvolume(self):
     """Test retrieval of the absolute path of the btrfs root from a true subvolume."""
     with alias(self._mount) as m:
-      root = make(m, "root", subvol=True)
-      self.assertEqual(_findRoot(root), m.path())
+      repo = Repository(make(m, "root", subvol=True))
+      self.assertEqual(repo.root, m.path())


   def testRepositoryFindCommonSnapshots(self):
@@ -217,13 +217,14 @@ def testRepositoryListNoSnapshotPresentInSubdir(self):
     with alias(self._mount) as m:
       # Create a new sub-directory where no snapshots are present.
       directory = make(m, "dir")
+      repository = Repository(directory)
       # TODO: The assertion should really be assertEqual! I consider
       #       this behavior a bug because we pass in the -o option to
       #       the list command which is promoted as: "print only
       #       subvolumes below specified path".
       #       There is no subvolume below the given path, so reporting a
       #       snapshot here is wrong.
-      self.assertNotEqual(_snapshots(directory), [])
+      self.assertNotEqual(_snapshots(repository), [])


   def testRepositoryListOnlySnapshotsInRepository(self):
