3. Using synctool
The main power of synctool is that you can define logical groups and append
a group name to a filename as an extension. A file is then copied to a node
only if that node belongs to the group.
The groups a node belongs to are defined in the synctool.conf file.
In the configuration file, the nodename is associated with one or more groups.
The nodename itself can also be used as a group to indicate that a file
belongs to that node.
Under the synctool root there are these interesting directories:
/opt/synctool/var/overlay/
/opt/synctool/var/delete/
/opt/synctool/var/purge/
Together, these directories are referred to as ‘the repository’.
The overlay/
tree contains files that have to be copied to the target nodes.
When synctool detects a difference between a file on the system and a file
in the overlay tree, the file will be copied from the overlay tree onto
the node.
The delete/
tree contains files that always have to be deleted from the
nodes. Only the filename matters, so it is alright if the files in this tree
are only zero bytes in size.
The purge/ tree contains directories that are copied as-is to the nodes.
Any unmanaged files on the target node, that is, files that should not be
there, are deleted.
synctool uses rsync
to copy these trees to the node, and afterwards it
runs the synctool-client
command on that node. Note that it is perfectly
possible to run synctool-client
on a node by hand, in which case it will
check its local copy of the repository. The client by itself will not
synchronize with the master repository; synctool works with server push
and not client pull.
In old times, synctool was located under /var/lib/synctool/. It worked for me (tm), except that the Filesystem Hierarchy Standard (FHS) has various things to say about it:
- thou shalt put configuration files under an etc/ directory;
- thou shalt not execute programs from the /var partition;
- /var may be mounted read-only;
- programs that want to keep things together, should use /opt.
If you have difficulty with getting used to synctool’s new root, try this:
- symlink /var/lib/synctool -> /opt/synctool/var
- export overlay=/opt/synctool/var/overlay ; cd $overlay
3.1 Populating the repository
In the repository you will store all the important system configuration files of the cluster nodes. The overlay directory represents the root directory of the cluster nodes. By assigning an extension to a file in the repository, you can tell synctool what nodes should get what copy of a file. Consider this example:
/opt/synctool/var/overlay/all/
etc/ntp.conf._all
etc/ntp.conf._node1
etc/ntp.conf._wn
Here, worker nodes (nodes tagged with group wn in synctool.conf) will
get the file ntp.conf._wn for /etc/ntp.conf. Node node1 is special
and gets a different file. All other nodes will get ntp.conf._all.
There is a special group named 'none'. Files with the extension ._none
will be copied to no nodes at all. This can be convenient when you
temporarily wish to ‘disable’ a file.
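The selection described above can be sketched in a few lines of Python. This is only an illustration, not synctool's actual code: the function name select_source is hypothetical, and the sketch assumes that a node's groups are listed from most specific (the node name) to least specific (all).

```python
def select_source(candidates, node_groups):
    """Pick the repository file that a node would use, or None.

    candidates  -- repository file names such as 'ntp.conf._wn'
    node_groups -- the node's groups, most specific first (assumption),
                   e.g. ['node1', 'wn', 'all']
    """
    by_group = {}
    for name in candidates:
        # The group is whatever follows the last '._' in the file name.
        _base, separator, group = name.rpartition("._")
        if separator:
            by_group[group] = name

    for group in node_groups:
        if group in by_group:
            return by_group[group]
    # Nothing matched. A '._none' file never matches, because no node
    # is a member of group 'none'.
    return None


files = ["ntp.conf._all", "ntp.conf._node1", "ntp.conf._wn"]
print(select_source(files, ["node1", "wn", "all"]))   # the node-specific copy
print(select_source(files, ["node7", "wn", "all"]))   # the worker node copy
print(select_source(files, ["node9", "all"]))         # the default copy
```

The first matching group wins, which is why the order of node_groups matters in this sketch.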
synctool responds to the directory directly under overlay/; it selects
this subtree as a candidate when the node has a matching group. For example,
/opt/synctool/var/overlay/wn/
etc/ntp.conf._all
This file will only be used on worker nodes because it resides in the
overlay directory specific to the group wn.
Tip: Do not make group-specific overlay directories for each and every group. Instead, think about what subclusters you have, and arrange your repository accordingly. See also chapter 5 on Best Practices.
In synctool version 5, you would configure ‘overlaydir’ and synctool would still consider all overlay directories no matter what name the subdirectory had. In synctool 6 and up, the group is strictly enforced and the subtree is synced to only those nodes that are in the group. Slave nodes are special; they get a full copy of the repository.
To populate the repository, you can scp
files from nodes, or you can use
synctool’s super convenient upload feature:
synctool -n node1 --upload /etc/ntp.conf
synctool -n node1 -u /etc/ntp.conf
synctool will automatically choose an extension for the file to save. If you disagree and want a different suffix, choose one:
synctool -n node1 --upload /etc/ntp.conf --suffix wn
synctool -n node1 -u /etc/ntp.conf -s wn
synctool will suggest the overlay directory where to put the file in the repository. If you disagree, use:
synctool -n node1 --upload /etc/ntp.conf --overlay mycluster
synctool -n node1 -u /etc/ntp.conf -o mycluster
By default synctool does a dry run. It will not do anything but show
what would happen if this were not a dry run. Add -f or --fix to
really upload the file.
Now edit the uploaded ntp.conf, make some changes and run synctool:
root@masternode:/# synctool
node1: DRY RUN, not doing any updates
node1: /etc/ntp.conf updated (file size mismatch)
Again, synctool does a dry run. It shows the file is going to be updated because there is a mismatch in the file size. Should the file size be the same, synctool will calculate an MD5 checksum to see whether the file was changed or not.
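The comparison order described here (size first, checksum only when the sizes are equal) can be sketched as follows. This is a simplified illustration under stated assumptions, not synctool's own code: the names md5_of and needs_update are hypothetical, and only the "file size mismatch" message is taken from the output shown above; the other messages are made up for the example.

```python
import hashlib
import os


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_update(repo_file, node_file):
    """Return a reason string if node_file differs from repo_file, else None."""
    if not os.path.exists(node_file):
        return "does not exist"            # hypothetical message
    # Cheap check first: different sizes always mean different content.
    if os.path.getsize(repo_file) != os.path.getsize(node_file):
        return "file size mismatch"
    # Same size, so only a checksum can tell the files apart.
    if md5_of(repo_file) != md5_of(node_file):
        return "checksum mismatch"         # hypothetical message
    return None
```

Checking the size first avoids reading both files in full in the common case where an edit changed the length of the file.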
You may want to review your changes before applying them, or inspect the difference between the version in the repository and what’s currently installed on a node:
synctool -n node1 --diff /etc/ntp.conf
synctool -n node1 -d /etc/ntp.conf
This will present a UNIX ‘diff’ of the files. Note the destination path in the syntax of the command.
To apply the change, you could now run synctool with option --fix.
But it may be better to read on first: we are going to have synctool
automatically reload ntpd after updating the ntp.conf file.
3.2 Adding actions to updates
Now I would like the ntpd
to be automatically reloaded after I change
the ntp.conf
file. This is done by adding a trigger script, in
synctool-speak known as a “.post” script.
Make a new file overlay/all/etc/ntp.conf.post
and put only this line in it:
service ntp reload
Make the .post
script executable: chmod +x ntp.conf.post
.
The .post
script will be run when the file changes:
root@masternode:/# synctool -f
node1: /etc/ntp.conf updated (file size mismatch)
node1: running command $overlay/all/etc/ntp.conf.post
The .post
script is run after synctool updated the file, and likewise,
you may also create a .pre
script that runs before the update:
root@masternode:/# synctool -f
node1: running command $overlay/all/etc/ntp.conf.pre
node1: /etc/ntp.conf updated (file size mismatch)
node1: running command $overlay/all/etc/ntp.conf.post
The .pre
and .post
scripts are executed in the directory where the
accompanying file resides; in this case /etc/
. It is possible to add
a group extension to the script, so that you can have one group of nodes
perform different actions than another.
The scripts are run with sh -c. Note that /bin/sh is often not the
same as bash, so some clever shell scripting tricks may not work. However,
you can fix this by including “#!/bin/bash” at the top of the .post
script.
In the environment you will find two variables that might be useful:
- SYNCTOOL_NODE is set to the node that we’re running on
- SYNCTOOL_ROOT is set to the directory where synctool lives
So expanding on that, $SYNCTOOL_ROOT/bin/ is the bindir, and the repository
is found under $SYNCTOOL_ROOT/var/overlay/.
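To make the execution environment of a .pre or .post script concrete, here is a rough Python sketch of how such a script could be launched: through sh -c, in the directory of the accompanying file, with the two variables set. The function run_hook and its parameters are hypothetical; the real synctool-client may do this differently.

```python
import os
import subprocess
import tempfile


def run_hook(command, target_path, node, synctool_root="/opt/synctool"):
    """Run a hook command the way the text describes (illustrative only)."""
    env = dict(os.environ)
    env["SYNCTOOL_NODE"] = node
    env["SYNCTOOL_ROOT"] = synctool_root
    # Scripts run in the directory where the accompanying file resides.
    workdir = os.path.dirname(target_path) or "."
    return subprocess.run(
        ["sh", "-c", command],
        cwd=workdir,
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # Demo in the system temp directory, so nothing under /etc is touched.
    demo_target = os.path.join(tempfile.gettempdir(), "ntp.conf")
    result = run_hook('echo "running on $SYNCTOOL_NODE"', demo_target, "node1")
    print(result.stdout, end="")
```

Because the command goes through sh -c, anything that depends on bash features needs the #!/bin/bash line mentioned above.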
A .post
script for a directory will trigger when any file in that directory
changes. This is particularly useful for daemons that have multiple config
files in a directory, such as conf.d
, or, for example, /etc/cron.d
.
A .pre
script for a directory will only trigger if the directory does not
exist and will be created.
3.3 Other useful options
The option -q
of synctool gives less output:
root@masternode:/# synctool -q
node3: /etc/xinetd.d/identd updated (file size mismatch)
If -q
still gives too much output, because you have many nodes in your
cluster, it is possible to specify -a
to condense (aggregate) output.
The condensed output groups together output that is the same for many nodes.
One of my favorite commands is synctool -qa
.
You may also use option -a
to condense output from dsh
, for example
# dsh -a date
# dsh-ping -a
The option -f or --fix applies all changes. Always be sure to run
synctool at least once as a dry run (without -f) first!
Mind that synctool does not lock the repository and does not guard against
concurrent use by multiple sysadmins at once. In practice, this hardly ever
leads to any problems.
To update only a single file rather than all files, use the option
--single
or -1
(that’s a number one, not the letter ell).
You may give multiple --single
options to update multiple files at once.
If you want to check what file synctool is using for a given destination
file, use option --ref or -r:
root@masternode:/# synctool -q -n node1 -r /etc/resolv.conf
node1: /etc/resolv.conf._somegroup
synctool can be run on a subset of nodes, a group, or even on individual
nodes using the options --node
or -n
, --group
or -g
, --exclude
or -x
, and --exclude-group
or -X
. This also works for dsh
and friends,
and you may use the range syntax to select a range of nodes.
For example:
# synctool -g batch,sched -X rack8
More examples:
# dsh -n node1,node2,node3 date
# dsh -n node[1-3] date
# dsh -n node[01-10] -x node[05-07] hostname
# dsh -n node[02-10/2,05,07] hostname
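The range syntax in these examples can be illustrated with a small expander. This is a sketch based only on the examples above, not synctool's own parser: the function expand_range is hypothetical, and reading "/2" as a step of two and keeping the zero padding of the first number are assumptions.

```python
import re


def expand_range(expr):
    """Expand 'node[02-10/2,05,07]' into a list of node names."""
    match = re.fullmatch(r"(.*?)\[([^\]]+)\](.*)", expr)
    if not match:
        return [expr]                      # plain node name, nothing to expand
    prefix, body, suffix = match.groups()
    nodes = []
    for part in body.split(","):
        step = 1
        if "/" in part:                    # assumed meaning: 'start-end/step'
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if "-" in part:
            start_text, end_text = part.split("-", 1)
        else:
            start_text = end_text = part   # single number such as '05'
        width = len(start_text)            # preserve zero padding
        for number in range(int(start_text), int(end_text) + 1, step):
            nodes.append(f"{prefix}{number:0{width}d}{suffix}")
    return nodes


print(expand_range("node[1-3]"))
print(expand_range("node[02-10/2,05,07]"))

# Emulating '-n node[01-10] -x node[05-07]':
excluded = set(expand_range("node[05-07]"))
print([n for n in expand_range("node[01-10]") if n not in excluded])
```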
Copy a file to three nodes:
# dsh-cp -n node[1-3] patchfile-1.0.tar.gz /tmp
After rebooting a cluster, use dsh-ping
to see if the nodes respond to ping
yet. You may also do this on a group of nodes:
# dsh-ping -g rack4
The option -v
gives verbose output. This is another way of displaying
the logic that synctool performs:
# synctool -v
node3: checking $overlay/all/etc/tcpd_banner.production._all
node3: overridden by $overlay/all/etc/tcpd_banner.production._batch
node3: checking $overlay/all/etc/issue.net.production._all
node3: checking $overlay/all/etc/syslog.conf._all
node3: checking $overlay/all/etc/issue.production._all
node3: checking $overlay/all/etc/modules.conf._all
node3: checking $overlay/all/etc/hosts.allow.production._interactive
node3: skipping $overlay/all/etc/hosts.allow.production._interactive,
it is not one of my groups
The option --unix
produces UNIX-style output. This shows in standard shell
syntax just what synctool is about to do.
root@masternode:/# synctool --unix
node3: # updating file /etc/xinetd.d/identd
node3: mv /etc/xinetd.d/identd /etc/xinetd.d/identd.saved
node3: umask 077
node3: cp /var/lib/synctool/overlay/etc/xinetd.d/identd._all
/etc/xinetd.d/identd
node3: chown root.root /etc/xinetd.d/identd
node3: chmod 0644 /etc/xinetd.d/identd
synctool does not apply changes by executing shell commands; all operations are programmed in Python. The option
--unix
is only a way of displaying what synctool does, and may be useful when debugging.
The option -T produces terse output. In terse mode, long paths are
abbreviated in an attempt to fit them on a single line of 80 characters wide.
Terse mode can be made to give colored output through synctool.conf.
root@masternode# synctool -n n1 -T
n1: DRYRUN not doing any updates
n1: mkdir /Users/walter/src/.../testroot/etc/cron.daily
n1: new /Users/walter/src/.../testroot/etc/cron.daily/testfile
n1: exec //overlay/Users/.../testroot/etc/cron.daily.post
Note that these abbreviated paths can still be copy-and-pasted and used with
other synctool commands like --single
and --diff
. synctool will recognize
the abbreviated path and expand it on the fly. In the case of any name clashes
synctool will report this and present a list of possibilities for you to
consider.
The option --skip-rsync
skips the rsync
run that copies the repository
from the master to the client node. You may use this option when you are
absolutely certain that the master and client are already in sync, for example
if you just ran synctool to examine any changes. In general, this option is
unnecessary, but it may be efficient if you are working with slow network
links or a large synctool repository.
3.4 Templates
For ‘dynamic’ config files, synctool has a feature called templates.
There are a number of rather standard configuration files that (for example)
require the IP address of a node to be listed. These are not particularly
synctool friendly. You are free to upload each and every unique instance
of the config file in question into the repository, however, if your cluster
is large this does not make your repository look very nice, nor does it
make them any easier to handle. Instead, make a template and couple it with
a ._template.post
script that calls synctool-template
to generate the
config file on the node.
As an example, I will use a fictional snippet of config file, but this
trick applies to things like sshd_config
with a specific ListenAddress
in it, and network configuration files that have static IPs configured.
# fiction.conf._template
MyPort 22
MyIPAddress @IPADDR@
SomeOption no
PrintMotd yes
And the accompanying fiction.conf._template.post
script:
#! /bin/sh
IPADDR=`ifconfig en0 | awk '/inet / { print $2 }'`
export IPADDR
/opt/synctool/bin/synctool-template "$1" >"$2"
This example uses ifconfig
to get the IP address of the node. You may also
use the ip addr
command, consult DNS or you might be able to use
synctool-config
to get what you need.
The synctool-template
command takes as input the template file (“$1
”)
and redirects the output to a newly generated file (“$2
”). The “$2
”
on the last line expands to fiction.conf._nodename
.
Hence, synctool generates a new config file in the repository. It does so
even on dry runs; you can ask synctool to display a diff of fiction.conf
even though it is templated.
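The substitution step itself is simple. The following Python sketch shows the idea of replacing @VAR@ placeholders with values from a mapping such as the environment. It is not the implementation of synctool-template; the function render_template is hypothetical, and leaving unknown placeholders untouched is a choice made for this example.

```python
import os
import re


def render_template(text, variables=None):
    """Replace @NAME@ placeholders in text with values from variables."""
    if variables is None:
        variables = os.environ             # default: the shell environment

    def substitute(match):
        name = match.group(1)
        # Unknown names are left as-is so they stand out in a diff.
        return variables.get(name, match.group(0))

    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)@", substitute, text)


template = "MyPort 22\nMyIPAddress @IPADDR@\nSomeOption no\n"
print(render_template(template, {"IPADDR": "10.0.0.5"}), end="")
```

The address 10.0.0.5 is made up for the demonstration; in the .post script above the value comes from the IPADDR environment variable.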
Take care not to redirect the output of synctool-template directly over the target file. Doing that is destructive and wrong; it defies synctool’s dry-run mode and keeps you from being able to review changes, a core function of synctool.
Instead of using synctool-template, you might use the UNIX sed command.
If you have multiple variables to replace, synctool-template is easier.
synctool-template accepts variables either from the command-line or from
the shell environment. Like with regular .post
scripts, the environment
variables SYNCTOOL_NODE
and SYNCTOOL_ROOT
are also present here.
However, unlike regular .post scripts, template post scripts require a #!
hashbang line. This is required for shell arguments (like “$1”, “$2”)
to work.
Now, when you want to change the configuration, edit the template file. synctool will fill in the template and see the difference with the target file.
Template files and template post scripts can have group extensions to select different templates for certain groups of nodes.
If you want to automatically reload or restart a service after updating
fiction.conf
, you’ll also have to implement a regular .post
script for
that: fiction.conf.post
.
3.5 Purge directories
In the previous sections we saw how you can use the overlay/
and delete/
trees to manage your cluster. synctool has a third mechanism of syncing
files, and it works with the purge/
tree. Purge directories are great for
mirroring entire directory trees to groups of nodes.
Unlike with the overlay/
tree, files in the purge/
tree do not have group
extensions. Instead, synctool will copy the entire subtree and it will
delete any files on the target node that do not reside in the source tree.
So, it will make a perfect mirror of the source under purge/
.
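To see what "a perfect mirror" implies, the following sketch lists the files that a purge would remove from a target directory, namely everything that is not present in the source tree. synctool itself delegates this work to rsync; the functions relative_files and unmanaged_files are hypothetical and only illustrate the rule.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.add(os.path.relpath(full_path, root))
    return found


def unmanaged_files(source_root, target_root):
    """Files that exist on the target but not in the source tree.

    In a purge, these are the files that would be deleted.
    """
    return sorted(relative_files(target_root) - relative_files(source_root))
```

Running unmanaged_files on a copy of the purge source and a copy of the node's directory before a real run is one way to preview what would disappear.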
To populate the purge/
tree, use --upload
with the --purge
option:
# synctool -n n1 --upload /usr/local --purge compute
# synctool -n n1 -u /usr/local -p compute
In this example, we want to upload the entire /usr/local
tree from node n1
to the repository directory /opt/synctool/var/purge/compute/
.
Afterwards, all compute nodes will get /usr/local
synced via the purge
mechanism by running synctool -f
.
Purging is a blunt but effective means to synchronise directory trees. Mind that it will delete data that is not supposed to be there, so be careful with this feature. For added safety, synctool will not allow you to purge the root directory of a system.
Under the hood, synctool employs rsync to purge files. Hence, you cannot
trigger actions through .post scripts in the purge directory, but it is
possible to use synctool --diff, --ref, and even --single with files
that reside under purge/.
Remember that purging is for making perfect mirrors. It is like sharing a
directory across nodes. Once you start differentiating directory content
between nodes, “purge” will no longer work in a satisfying way; in such a
case, you should really use overlay/
rather than purge/
.
dsh-cp
also has an option --purge
to quickly mirror directories across
nodes. Use with care.
3.6 The order of operations
The previous sections described a lot of operations that synctool performs when it runs. This section summarises what we have seen so far. For a normal synctool run, the order of operations is roughly as follows.
- synchronise the synctool installdir to each node. This synchronises
  the repository as well as the main program and config file.
  Any subtrees under overlay, delete, and purge that do not apply to the
  target node are excluded.
- run synctool-client on the nodes
- synctool-client mirrors the purge directory
- synctool-client processes the overlay directory;
  - generate templates by running ._template.post scripts
  - compare files
  - check filetype
  - check file size
  - check MD5 checksum
  - check file ownership
  - check file mode
  - make backup copies
  - update files as needed
  - run .post script for any updated files
  - run .post script (if any) for changed directories
- synctool-client deletes files listed in the delete directory
  - run .post script (if any) for deleted files
  - run .post script (if any) for changed directories
3.7 dsh-pkg, the synctool package manager
synctool comes with a package manager named dsh-pkg
.
Rather than being yet another package manager with its own format of packages,
dsh-pkg is a wrapper around existing package management software.
dsh-pkg unifies all the different package managers out there so you can
operate any of them using just one command and the same set of command-line
arguments. This is particularly useful in heterogeneous clusters or when
you are working with multiple platforms or distributions.
dsh-pkg supports a number of different package management systems and will
detect the appropriate package manager for the operating system of the node.
If detection fails, you may force the package manager on the command-line or
in synctool.conf
:
#package_manager apk
package_manager apt-get
#package_manager brew
#package_manager bsdpkg
#package_manager dnf
#package_manager pacman
#package_manager pkg
#package_manager yum
#package_manager zypper
dsh-pkg knows about more platforms and package managers, but currently only the ones listed above are implemented and supported.
dsh-pkg is pluggable. Adding support for other package management systems is rather easy. If your platform and/or favorite package manager is not yet supported, feel free to develop your own plug-in for dsh-pkg or contact the author of synctool.
The pkg module is for FreeBSD; use bsdpkg on other BSD systems.
Following are examples of how to use dsh-pkg.
dsh-pkg -n node1 --list
dsh-pkg -n node1 --list wget
dsh-pkg -g batch --install lynx wget curl
dsh-pkg -g batch -x node3 --remove somepackage
Sometimes you need to refresh the contents of the local package database. You can do this with the ‘update’ command:
dsh-pkg -qa --update
You may check for software upgrades for the node with --upgrade
.
This will only show what upgrades are available. To really upgrade a node,
specify --fix
. It is wise to always test an upgrade on a single node.
dsh-pkg --upgrade
dsh-pkg -n testnode --upgrade -f
dsh-pkg --upgrade -f
Package managers download their packages into an on-disk cache. Sometimes the disk fills up and you may want to clean out the disk cache:
dsh-pkg -qa --clean
A specific package manager may be selected from the command-line.
dsh-pkg -m yum -i somepackage # force it to use yum
If you want to further examine what dsh-pkg is doing, you may specify
--verbose
or --unix
to display more information about what is going on
under the hood.
3.8 Ignoring them: I’m not touching you
By using directives in the synctool.conf
file, synctool can be told to
ignore certain files, nodes, or groups. These will be excluded, skipped.
For example:
ignore_dotfiles no
ignore_dotdirs yes
ignore .svn
ignore .gitignore .git
ignore .*.swp
synctool will not run on ignored nodes or on nodes that are in a group that is ignored:
ignore_node node1 node2
ignore_group broken
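The file patterns in the ignore directives look like shell globs. Assuming that is how they are matched (against the base name of each file), the check can be sketched with Python's fnmatch module. The function is_ignored is hypothetical and does not cover the ignore_dotfiles, ignore_dotdirs, ignore_node, or ignore_group settings.

```python
import fnmatch

# Patterns taken from the 'ignore' lines in the example above.
IGNORE_PATTERNS = [".svn", ".gitignore", ".git", ".*.swp"]


def is_ignored(filename, patterns=IGNORE_PATTERNS):
    """Return True if the base name matches any ignore pattern."""
    return any(fnmatch.fnmatchcase(filename, pattern) for pattern in patterns)


for name in (".ntp.conf.swp", ".git", "ntp.conf._all"):
    print(name, is_ignored(name))
```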
3.9 Backup copies
For any file synctool updates, it keeps a backup copy around on the target
node with the extension .saved
. If you don’t like this, you can tell
synctool to not make any backup copies with:
backup_copies no
It is however highly recommended that you run with backup_copies
enabled.
You can manually specify that you want to remove backup copies using:
synctool --erase-saved
synctool -e
To erase a single .saved
file, use option --single
in combination with
--erase-saved
.
For some (Linux) directories like /etc/cron.d/ and /etc/xinetd.d/, it is
not OK to keep .saved files around because they influence how the daemons
function. For these directories it is recommended that you implement
a .post script that removes the backup copies, like so:
# $overlay/all/etc/xinetd.d.post
rm -f *.saved
service xinetd reload
Alternatively, you may want to move the backup copies to a safe location.
3.10 Logging
When using option --fix to apply changes, synctool logs the changes it makes
to syslog on the master node. This provides a trace of what was changed on the
systems. On large clusters, this may produce a lot of log records. If you
don’t want any logging, you can disable it in synctool.conf:
syslogging no
When you do use syslogging, you may want to split off the synctool messages
to a separate file like /var/log/synctool.log
. Please see your syslogd
manual on how to do this. In the contrib/
directory in the synctool source,
you will find config files for use with syslog-ng
and logrotate
.
3.11 About symbolic links
synctool requires all files in the repository to have an extension (well … unless you changed the default configuration), and symbolic links must have extensions too. Symbolic links in the repository will be dead symlinks but they will point to the correct destination on the target node.
Consider the following example, where file
does not exist ‘as is’ in the
repository:
$overlay/all/etc/motd._red -> file
$overlay/all/etc/file._red
In the repository, motd._red
is a red & dead symlink to file
. On the
target node, /etc/motd
is going to be fine.
3.12 Slow updates
By default, synctool addresses the nodes in parallel, and they run updates concurrently. In some cases you might not want to have any parallelism. There are two easy ways around this:
dsh --numproc=1 uptime
dsh -p 1 uptime
dsh --zzz=10 uptime
dsh -z 10 uptime
The first pair of commands tells synctool (or in this case, dsh) to run only
one process at a time. The second pair does the same thing, and also sleeps
for ten seconds after running the command.
Suppose you have a 60-node cluster, and run with --zzz=60. You now have to wait at least one hour for the run to complete.
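The one-hour figure follows from simple arithmetic: nodes are handled one at a time, and each is followed by a pause. A tiny sketch, with a hypothetical helper name and ignoring the time the command itself takes:

```python
def minimum_run_seconds(node_count, zzz_seconds):
    """Lower bound on the run time when sleeping after every node."""
    # One node at a time, each followed by a pause of zzz_seconds.
    return node_count * zzz_seconds


seconds = minimum_run_seconds(60, 60)
print(seconds, "seconds, or", seconds / 3600, "hour(s)")
```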
The options --numproc
and --zzz
work for both synctool
and dsh
programs.
3.13 Checking for updates
synctool can check whether a new version of synctool itself is available by
using the option --check-update
on the master node. You can check
periodically for updates by using --check-update
in a crontab entry.
To download the latest version, run synctool --download
on the master node.
These functions connect to the main website at www.heiho.net/synctool.
3.14 Running tasks with synctool
synctool’s dsh
command is ideal for running commands on groups of nodes.
On occasion, you will also want to run custom scripts with dsh
.
These scripts can be placed in scripts/
, and dsh
will find them.
When running a command that resides under scripts/, dsh will sync this
script to the target node prior to running the command on the remote side.
This is done to make sure that the ‘current’ version of the script always
runs on the target node.
For example, if you have a script /opt/synctool/scripts/admin_example.sh
then you might run:
dsh -n node1 admin_example.sh
No path to the script is required; dsh will find it.
Old versions had a tasks/ directory under the repository and you could invoke synctool with the --tasks option. This mechanism has been obsoleted by dsh and the scripts/ directory.
Note that you can write scripts to do software package installations,
but you may also use the dsh-pkg
command.
3.15 Multiplexed connections
synctool and dsh can multiplex SSH connections over a ‘master’ connection. This feature greatly speeds up synctool and dsh because it allows skipping the costly SSH handshake for every new connection. Multiplexing is started through dsh:
dsh -M # start master connections
dsh -O check # check master connections
dsh -O stop # stop master connections
dsh -O exit # terminate master connections
You may also do this for certain groups or nodes, like so:
dsh -g all -M
dsh -n node1 -O check
synctool will detect any open control paths and use them if they are present.
The control paths (socket files) to each node are kept under synctool’s temp
directory (by default: /tmp/synctool/sshmux/
).
These control paths are managed by ssh mux processes that are running in the
background. If your cluster is very large, you might find the large number of
ssh mux processes on the management node to be objectionable. These processes
are mostly sleeping so it shouldn’t pose a problem.
The control paths may be given a timeout by using the config parameter
ssh_control_persist
. Note that this parameter is only supported for
OpenSSH 5.6 and later. The timeout may also be specified on the command-line:
dsh -M --persist 4h
The ControlMaster and ControlPath options of ssh first appeared in OpenSSH version 3.9. synctool also supports ControlPersist, which is present in OpenSSH version 5.6 and later. See man ssh_config for more information on these OpenSSH options.
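For readers who want to see how these OpenSSH options fit together, the sketch below builds plain ssh command lines for starting and checking a master connection. It uses only standard ssh flags (-M, -N, -f, -O check, -o ControlPath, -o ControlPersist). The helper names and the one-socket-per-node file naming are assumptions; the command lines that dsh really generates may differ.

```python
def control_path(node, control_dir="/tmp/synctool/sshmux"):
    """Socket path for a node (the naming scheme is an assumption)."""
    return f"{control_dir}/{node}"


def start_master_command(node, persist="4h"):
    """ssh command line that opens a background master connection."""
    return [
        "ssh", "-M", "-N", "-f",
        "-o", f"ControlPath={control_path(node)}",
        "-o", f"ControlPersist={persist}",
        node,
    ]


def check_master_command(node):
    """ssh command line that asks whether the master is still running."""
    return ["ssh", "-O", "check", "-o", f"ControlPath={control_path(node)}", node]


print(" ".join(start_master_command("node1")))
print(" ".join(check_master_command("node1")))
```

The commands are only printed here, not executed, so the sketch can be run without any reachable nodes.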