Unison FAQ - Tips and Tricks

I want to ignore deleted files. Is it a good idea to delete the archive files after every synchronization, so that Unison considers all of my files new and copies them to the replica where they are missing?

I want to use Unison to synchronize really big replicas. How can I improve performance?

When you synchronize a large directory structure for the first time, Unison will need to spend a lot of time walking over all the files and building an internal data structure called an archive. There is no way around this: Unison uses these archives in a critical way to do its work. While you're getting things set up, you'll probably save time if you start off focusing Unison's attention on just a subset of your files, by including the option -path some/small/subdirectory on the command line. When this is working to your satisfaction, take away the -path option and go get lunch while Unison works. This rebuilding operation will sometimes need to be repeated when you upgrade Unison (major upgrades often involve changes to the format of the archive files; minor upgrades generally do not).

Next, make sure that you are not "remote mounting" either of your replicas over a network connection. Unison needs to run close to the files that it is managing; otherwise performance will be very poor. Set up a client-server configuration as described in the installation section of the manual.

If your replicas are large and at least one of them is on a Windows system, you will probably find that Unison's default method for detecting changes (which involves scanning the full contents of every file on every sync---the only completely safe way to do it under Windows) is too slow. In this case, you may be interested in the fastcheck preference, documented in the section "Fast Update Checking" of the user manual.

In normal operation, the longest part of a Unison run is usually the time that it takes to scan the replicas for updates.
This requires examining the filesystem entry for every file (i.e., doing an fstat on each inode) in the replica. This means that the total number of inodes in the replica, rather than the total size of the data, is the main factor limiting Unison's performance. Update detection times can be improved (sometimes dramatically) by telling Unison to ignore certain files or directories. See the description of the ignore and ignorenot preferences in the section "Preferences" of the user manual. (One could also imagine improving Unison's update detection by giving it access to the filesystem logs kept by some modern "journaling filesystems" such as ext3 or ReiserFS, but this has not been implemented. We have some ideas for how to make it work, but it will require a bit of systems hacking that no one has volunteered for yet.)

Another way of making Unison detect updates faster is by "aiming" it at just a portion of the replicas by giving it one or more path preferences. For example, if you want to synchronize several large subdirectories of your home directory between two hosts, you can set things up like this:

Create a common profile (called, e.g., common) containing most of your preferences, including the two roots:
root = /home/bcpierce
root = ssh://saul.cis.upenn.edu//home/bcpierce
ignore = Name *.o
ignore = Name *.tmp
etc.
Create a default profile default.prf with path preferences for all of the top-level subdirectories that you want to keep in sync, plus an instruction to read the common profile:
path = current
path = archive
path = src
path = Mail
include common
Running unison default will synchronize everything. (If you want to synchronize everything in your home directory, you can omit the path preferences from default.prf.)
Create several more preference files similar to default.prf but containing smaller sets of path preferences. For example, mail.prf might contain:
path = Mail
include common
Now running unison mail will scan and synchronize just your Mail subdirectory.

Once update detection is finished, Unison needs to transfer the changed files. This is done using a variant of the rsync protocol, so if you have made only small changes in a large file, the amount of data transferred across the network will be relatively small. Unison carries out many file transfers at the same time, so the per-file setup time is not a significant performance factor.

How do I use USB memory sticks/flash drives with Unison?

Most memory sticks/flash drives/pen drives/USB sticks come formatted as FAT. FAT does not support all of the permissions that *nix systems do, so Unison must be told not to check file permissions when syncing to memory sticks.

Secondly, I want to synchronise files in different directories to the memory stick. To do that, I create a 'laptop-sync' folder on my laptop. For any file on my laptop that I want to sync, I create a shortcut to it in the laptop-sync folder. That folder often contains nothing but shortcuts. One other step is to modify the profile to allow links.

If you haven't already, create a new Unison profile and point the first (local) directory to the 'laptop-sync' folder. Point the second directory to a folder on the memory stick. To modify the profile, look in the .unison directory and find the .prf (profile) file with the name for the memory stick sync. Edit that and add the following lines at the end:
# the following line tells unison to follow links
follow = Regex .*
# the perms line is necessary for the FAT filesystem on the memory stick to work;
# otherwise you keep getting an error message
perms = 0

Is there a way to get Unison not to prompt me for a password every time I run it (e.g., so that I can run it every half hour from a shell script)?

It's actually ssh that's asking for the password. If you're running the Unison client on a Unix system, you should check out the 'ssh-agent' facility in ssh. If you do
ssh-agent bash
(or ssh-agent startx, when you first log in), it will start a shell (or an X Windows session) in which all processes and sub-processes are part of the same ssh-authorization group. If, inside any shell belonging to this authorization group, you run the ssh-add program, it will prompt you once for your key's passphrase and then remember it for the duration of the bash session. You can then use Unison over ssh---or even run it repeatedly from a shell script---without giving your passphrase again.

It may also be possible to configure ssh so that it does not require any password: just enter an empty passphrase when you create a pair of keys. If you think it is safe enough to keep your private key unencrypted on your client machine, this solution should work even under Windows.

ssh-keygen is used to create a pair of keys to automate the ssh authentication. Here is an example:
ssh-keygen (accept all default values)
scp .ssh/id_rsa.pub my.remotehost.com:.ssh/authorized_keys
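
Note that the scp command above replaces any authorized_keys file that already exists on the remote host. The following is a minimal sketch of an appending alternative, not part of the original recipe: add_pubkey is a hypothetical helper name, and it only adds the key when it is not already listed.

```shell
#!/bin/sh
# add_pubkey KEYFILE AUTHFILE
# Hypothetical helper: append the public key in KEYFILE to AUTHFILE,
# creating AUTHFILE if needed and skipping keys that are already present.
add_pubkey() {
    keyfile=$1
    authfile=$2
    mkdir -p "$(dirname "$authfile")"
    touch "$authfile"
    chmod 600 "$authfile"
    # -x: match the whole line, -F: treat the key as a fixed string
    if ! grep -qxF -e "$(cat "$keyfile")" "$authfile"; then
        cat "$keyfile" >> "$authfile"
    fi
}

# The same idea across the network, without the duplicate check:
#   cat ~/.ssh/id_rsa.pub | ssh my.remotehost.com 'mkdir -p .ssh && cat >> .ssh/authorized_keys'
```

Many OpenSSH installations also ship an ssh-copy-id command that does the same job.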
Can Unison be used with SSH's port forwarding features?

Mark Thomas says the following procedure works for him:

After having problems with unison spawning a command line ssh in Windows I noticed that unison also supports a socket mode of communication (great software!) so I tried the port forwarding feature of ssh using a graphical SSH terminal, TTSSH. To use unison I start TTSSH with port forwarding enabled and log in to the Linux box where the unison server (unison -socket xxxx) is started automatically. In Windows I just run unison and connect to localhost (unison socket://localhost:xxxx/ ...).

Richard Murri also commented that the following works for him:
ssh machineA -L 9999:machineB:22
unison a.tmp ssh://user@localhost:9999/a.tmp

How can I use Unison from a laptop whose hostname changes depending on where it is plugged into the network?

This is partially addressed by the rootalias preference. See the discussion in the section "Archive Files" of the user manual.

Can I use Unison with version control systems (e.g., CVS, Subversion, darcs)?

How can I allow a Unison profile to be initiated from either end?

Imagine syncing between a laptop and a desktop computer, using the "minimal profile" from the manual, but you want to be able to initiate the synchronization from either computer. On the desktop, use the profile:
root = /home/bcpierce
root = ssh://laptop//home/bcpierce
include homedir

On the laptop, use:
root = /home/bcpierce
root = ssh://desktop//home/bcpierce
include homedir

On both ends, you maintain the included file:
path = current
path = common
path = .netscape/bookmarks.html
path = .unison/homedir.prf

Note that the

I need to use Unison with Linux and Windows, with Unicode characters in file names. How do I deal with this situation?

First of all, remember that Unison can manage Unicode characters on Linux, but not on Windows (I don't know about this problem on other platforms).
So if you need to synchronise between Windows and Linux, run Unison on Linux only. For this, you have to mount the Windows shares on Linux in such a way that Unicode characters are handled correctly. Here is a sample line to put in /etc/fstab (you need root access to edit this file; the entry must be written on a single line):
//IP_ADDRESS/WINDOWS_SHARE_NAME /LOCAL_PATH_TO_MOUNT_TO cifs rw,uid=UID,gid=GID,umask=000,file_mode=0777,dir_mode=0777,iocharset=utf8,credentials=/root/.smbcred 0 0
Of course, you have to replace:
Values for umask, file_mode and dir_mode should be chosen according to your needs. These sample values are very permissive, to be used in a trusted environment. The contents of the file /root/.smbcred are:
username=WINDOWS_USER_NAME
password=WINDOWS_USER_PASSWORD
Don't forget to add a blank line at the end of this file; it seems to be important for cifs. Of course, this file should be owned by root, to be safe.

Now you should be able to synchronize your Linux and Windows hosts, with Unicode in file names. Of course, there is a (quite big) problem with speed when this is used on large amounts of data, but that is out of the scope of this post.

How do I use ssh with a non-standard port?

In your .prf file use this structure for your root entry:
ssh://user@host:port//path/to/directory
Put in your own user, host, port, and path. The double slash after the port is important.

How do I synchronize with a computer behind a firewall?

In our office most computers are kept behind a firewall, with external access via ssh allowed only to the server computer. Interactively, one would log onto the server computer first, and then ssh into the desired machine. I finally figured out how I can do this via a pipe such that I can use unison to synchronize with computers behind the firewall.

Set up a script sshpipe.sh, e.g. in /home/user/bin:
#!/bin/bash
# very simple ssh pipe
# Use like this:
# unison-gtk -sshcmd /home/user/bin/sshpipe.sh
intermediate=gateway.cam.ac.uk
ssh "$intermediate" -C -e none ssh "$@"

gateway.cam.ac.uk is the name of the machine which is accessible from outside the firewall. Now just run unison with the sshcmd option
unison -sshcmd /home/user/bin/sshpipe.sh ...
where the root parameters for unison are set up in the usual way, just as if the initiating computer was also behind the firewall. In this case I have hardwired my preferred sshargs option (-C) but it would be relatively straightforward to parse the arguments for options and pass these on to ssh.
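
As a rough sketch of how the pipe script could be made a little more flexible, the variant below reads the gateway from an environment variable and can print the command instead of running it. The names SSHPIPE_GATEWAY and SSHPIPE_DRYRUN are invented for this illustration and are not Unison or ssh settings; per-option parsing is not attempted.

```shell
#!/bin/sh
# Sketch of a configurable ssh pipe (variable names are hypothetical).
#   SSHPIPE_GATEWAY  host reachable from outside the firewall
#   SSHPIPE_DRYRUN   if non-empty, print the command instead of running it
sshpipe() {
    gateway=${SSHPIPE_GATEWAY:-gateway.cam.ac.uk}
    if [ -n "${SSHPIPE_DRYRUN:-}" ]; then
        # Show what would be executed, with the arguments joined by spaces
        printf '%s\n' "ssh $gateway -C -e none ssh $*"
    else
        ssh "$gateway" -C -e none ssh "$@"
    fi
}

# Unison invokes the sshcmd program with arguments; do nothing without them.
if [ "$#" -gt 0 ]; then
    sshpipe "$@"
fi
```

Running the script by hand with SSHPIPE_DRYRUN=1 is a quick way to check which command Unison would end up executing before trying a real synchronization.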
Of course sshcmd can also be set in the profile rather than as an argument. This description assumes you have set up an ssh keychain mechanism such that ssh from the intermediary to the computer to be synchronized works without a further password check (i.e. the only password required will be the one to ssh from the initiating computer to the intermediary). See http://www.ibm.com/developerworks/library/l-keyc.html or http://www.gentoo.org/proj/en/keychain/ for guidance on setting up ssh keychains. It is possible that the above method even works where the password needs to be entered twice (first for access to the intermediary, then for the machine to be synchronized with) but I have no easy way of testing this.

Is it possible to synchronize two hosts via a removable drive?

Yes, there are 2 ways in which this can be done.
1) You can use the removable drive as the hub of your unison synchronization, and sync each computer with the removable drive.
2) There is a hybrid network/removable drive hack that can be used by creating a shell script. I use this method to keep a file server at my home office and regular office in sync.

Pros:
Cons:
So only 1 file per directory will be copied to the removable drive. Here are the steps required:
copyprog = rsync --inplace --compress ""--max-size=xxx""
copyprogrest = rsync --partial --inplace --compress ""--max-size=xxx""
copyprog = sneakernet.sh LocalRoot RemoteRoot SneakerNetPath
#!/bin/sh
# Usage
# $1 = the local root
# $2 = the remote path that unison will pass to rsync
# $3 = the path used for the sneakernet,
#      i.e. the path to the folder on a USB drive moved between locations
# $4 = source of the copy as sent by unison
# $5 = destination of the copy as sent by unison

# Debugging stuff
OUTFILE=/dev/null

SOURCEFILE=$4
# Map the source path from the local root onto the sneakernet folder
DESTFILE=`echo "$4" | sed "s|$1|$3|"`
# Keep everything up to and including the last slash
DESTDIR=`echo "$DESTFILE" | sed 's:\(/.*/\).*:\1:'`
echo "$SOURCEFILE"
echo "$DESTFILE"
echo "$DESTDIR"
if [ -s "$SOURCEFILE" ]
then
    #echo Copying from local
    mkdir -p "$DESTDIR" >> $OUTFILE
    cp -af "$SOURCEFILE" "$DESTFILE" >> $OUTFILE
else
    # The source is remote: look for a copy on the sneakernet drive instead
    SOURCEFILE2=`echo "$4" | sed "s|$2|$3|"`
    DESTFILE2=$5
    DESTDIR2=`echo "$DESTFILE2" | sed 's:\(/.*/\).*:\1:'`
    #echo $SOURCEFILE2
    #echo $DESTFILE2
    if [ -s "$SOURCEFILE2" ]
    then
        mkdir -p "$DESTDIR2" >> $OUTFILE
        cp -af "$SOURCEFILE2" "$DESTFILE2" >> $OUTFILE
    fi
fi
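
The sed substitutions in the script swap one root for another, but they treat the root as a pattern, so they can misfire on paths containing | or regular-expression characters, and they match anywhere in the path rather than only at the start. A possible alternative, shown here only as a sketch, uses plain POSIX prefix stripping; rebase_path is a hypothetical helper name and assumes the roots are given without a trailing slash.

```shell
#!/bin/sh
# rebase_path PATH FROM TO
# Hypothetical helper: if PATH lies under the directory FROM, print the
# same path relocated under TO; otherwise print PATH unchanged.
rebase_path() {
    path=$1
    from=$2
    to=$3
    case $path in
        "$from"/*)
            # Strip the literal prefix "FROM/" (no pattern matching on FROM)
            rest=${path#"$from"/}
            printf '%s\n' "$to/$rest"
            ;;
        *)
            printf '%s\n' "$path"
            ;;
    esac
}

# Example with made-up paths:
#   rebase_path /home/a/docs/x.txt /home/a /mnt/usb
```

In the sneakernet script, a call such as rebase_path "$4" "$1" "$3" could then stand in for the first sed pipeline.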