We wrestled with the decision to add it for some time, as we place a very, very high value on the simplicity of our systems. We have no intention of turning rsync.net into a development platform, running a single additional network service, or opening a single additional TCP port.
At the same time, there are a number of very straightforward synchronization and archival functions inherent to subversion and git that lend themselves very well to our offsite filesystem.
In addition, my own interest in DVCS in general, and git in particular as a tool for organizing plain old text files and unix file trees, added new urgency to the implementation.
That being said, we will not be adding any further such systems. There will not be cvs or Mercurial support in the future (although I am aware of some ways to run Mercurial over a git or subversion "transport", and users are free to do so).
Some details...
Both git and svn can now be run without restrictions on the server side. Previously, we only allowed you to access svnserve through a svn+ssh:// style URL that you would pass to your own applications (like Tortoise, or the command line 'svn' tool). It was impossible, then, to create new repositories or sync from some other repository or view logs, etc. You are now able to do all of these things.
First, let's look at subversion.
We'll start by creating a remote repository in an rsync.net account:
ssh [email protected] "svnadmin create svn_repository"
The svn command is now supported and enables a vast set of functionality which previously wasn't available. For example, if we wanted to keep an up-to-date working copy of one of our projects for which a repository was remotely available (we'll use http://example.com/svn/repository as our remote URL), we could issue the following commands:
ssh [email protected] "svn checkout http://example.com/svn/repository my_checkout"
... list of files checked out ...
Checked out revision 42.
If you'd like to update that checkout later, you can issue a command like:
ssh [email protected] "svn update my_checkout"
The svnsync command is also available. svnsync allows synchronization of repositories. This is useful to keep a full backup of an existing repository. Again, we'll use the URL of http://example.com/svn/repository as the repository for which we want to create a backup. First, we'll create a repository:
ssh [email protected] "svnadmin create synced_repository"
svnsync requires that we initialize the repository prior to doing any syncing. Part of this initialization process requires that we change repository attributes that are normally restricted, so we need to allow these properties to be changed by creating a hook script that exits with a zero return code. In order to do this, we'll create a symlink to the "echo" program:
ssh [email protected] "ln -s /bin/echo synced_repository/hooks/pre-revprop-change"
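To see why the symlink is enough, here is a minimal local sketch of what happens when Subversion invokes the hook. The temporary directory and the argument values (repository path, revision, user, property name, action) are stand-ins chosen for illustration; the only point is that echo ignores its arguments and exits with status zero.

```shell
#!/bin/sh
# Sketch: a symlink to /bin/echo behaves like a hook that always succeeds.
# The directory and argument values below are illustrative placeholders.
hooks=$(mktemp -d)
ln -s /bin/echo "$hooks/pre-revprop-change"

# Subversion calls the hook with: repository path, revision, user,
# property name, and action. echo prints them and returns zero.
"$hooks/pre-revprop-change" /path/to/repo 0 someuser svn:sync-lock A > /dev/null
status=$?
echo "hook exit status: $status"
```

A zero exit status is all svnsync needs in order to be allowed to change revision properties.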
With that hook script in place, we can now initialize our repository for syncing. Before we do so, we need to know the exact path to our repository, so let's discover the path to our home directory:
$ ssh [email protected] "pwd"
/data2/home/1234
$ ssh [email protected] "svnsync initialize file:///data2/home/1234/synced_repository http://example.com/svn/repository"
Copied properties for revision 0 (svn:sync-* properties skipped).
Now, we can issue the following command as often as needed in order to keep the repository up-to-date:
$ ssh [email protected] "svnsync sync file:///data2/home/1234/synced_repository"
Committed revision 1.
Copied properties for revision 1.
...
Committed revision 42.
Copied properties for revision 42.
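If you run the sync regularly, it may help to wrap it in a small script on your own machine. The following is only a sketch: ACCOUNT, REMOTE_HOME, and MIRROR are placeholder values, mirror_url and sync_command are hypothetical helper names, and the script defaults to a dry run that prints the command instead of connecting.

```shell
#!/bin/sh
# Sketch: build and (optionally) run the svnsync command from a local machine.
# ACCOUNT, REMOTE_HOME and MIRROR are placeholders; substitute your own values.
ACCOUNT="${ACCOUNT:-user@host.example}"          # hypothetical login
REMOTE_HOME="${REMOTE_HOME:-/data2/home/1234}"   # as reported by "pwd" on the server
MIRROR="${MIRROR:-synced_repository}"

# Build the file:// URL that svnsync expects from a home directory and repo name.
mirror_url() {
    printf 'file://%s/%s\n' "$1" "$2"
}

# Compose the command that will run on the server.
sync_command() {
    printf 'svnsync sync %s\n' "$(mirror_url "$REMOTE_HOME" "$MIRROR")"
}

# Dry run by default: print the command rather than opening a connection.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "ssh $ACCOUNT \"$(sync_command)\""
else
    ssh "$ACCOUNT" "$(sync_command)"
fi
```

Setting DRY_RUN=0 runs the command for real, which makes the script suitable for a cron job once the printed command looks right.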
We can also dump a repository to a file by using the 'svnadmin dump' command. Similarly, the database can be restored by using the 'svnadmin load' command. These two commands work in harmony to keep backups of subversion databases, or specific revisions within a database.
To make a single-file backup to my local machine of a repository existing on rsync.net, we can do the following:
$ ssh [email protected] "svnadmin dump existing_repository" > repository_backup.dump 2>repository_backup_errors.txt
The above command creates two files on the local machine, a repository_backup.dump file and a repository_backup_errors.txt file. The former is a single-file backup of the subversion repository and the latter is a list of notifications and errors that might be important.
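Since the dump travels over ssh, it is worth confirming that the file is not empty or cut short before relying on it. Here is a rough sketch; check_dump is a hypothetical helper name, and the demonstration uses a stand-in file rather than a real dump. It relies on the fact that Subversion dump streams begin with a format-version header line.

```shell
#!/bin/sh
# Sketch: sanity-check a file produced by "svnadmin dump".
# check_dump is a hypothetical helper, not part of Subversion.
check_dump() {
    dump_file=$1
    # The file must exist and be non-empty.
    [ -s "$dump_file" ] || return 1
    # Dump streams start with a "SVN-fs-dump-format-version" header line.
    head -n 1 "$dump_file" | grep -q '^SVN-fs-dump-format-version: '
}

# Demonstration on a stand-in file; use repository_backup.dump in practice.
sample=$(mktemp)
printf 'SVN-fs-dump-format-version: 2\n\n' > "$sample"
if check_dump "$sample"; then
    echo "dump header looks valid"
else
    echo "dump looks empty or truncated" >&2
fi
```

This only checks the first line, so it catches failed connections and empty files but is not a full verification; loading the dump into a scratch repository with 'svnadmin load' is the thorough test.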
Finally, the svn client can provide information about a checkout by using the "svn info" command. You can use this information to determine the source repository:
$ ssh [email protected] "svn info existing_repository"
Path: .
URL: http://example.com/directory
Repository Root: http://example.com/directory
Repository UUID: dc7efa32-411c-0410-9537-da5bd19367fc
Revision: 42
Node Kind: directory
Schedule: normal
Last Changed Author: admin
Last Changed Rev: 42
Last Changed Date: 2010-02-02 12:56:03 -0800 (Tue, 02 Feb 2010)
The svnlook command allows inspection of a repository itself. For example, to inspect the subversion repository and find out its UUID:
$ ssh [email protected] "svnlook uuid svn_repository"
dc7efa32-411c-0410-9537-da5bd19367fc
Let's next examine the new git support.
All standard git operations that do not require external programs are supported when the repository is hosted by rsync.net. (See the note about client functionality near the end of this post.)
To create a git repository on rsync.net we can issue the following command:
$ ssh [email protected] "git init --bare git_repo.git"
Initialized empty Git repository in /data2/home/1234/git_repo.git/
Note that the full path to the repository will be required for some operations.
Once a repository has been created, we would want to clone it so we have a local copy on which we could make changes. We can do so with the git clone command, but in order to do so we need the absolute path that was displayed when we initialized the repository:
$ git clone ssh://[email protected]/data2/home/1234/git_repo.git
Initialized empty Git repository in /home/kibab/git_repo/.git/
Once you've made changes to your local repository, they can be pushed back to rsync.net using "git push". By default, any changes we commit on our local filesystem are applied to the "master" branch. So, when we push changes up to rsync.net for the first time, we'll need to specify which branch should be pushed. When the "git clone" command was issued, git recorded the original repository in the .git/config file.
(NOTE: if you happen to have a hosts file alias for rsync.net, you might need to edit $REPO/.git/config so that if git attempts to bypass the hosts file lookup rsync.net may be correctly resolved.) Since git already knows where the origin (i.e. original) repository is, we can simply issue a push on the local filesystem:
$ git push origin master
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 3.33 MiB | 72 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://[email protected]/data2
 * [new branch]      master -> master
After pushing the master branch, we'll no longer need to specify the branch we're working with as it will have contents and be selected by default unless otherwise changed.
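If you'd like to rehearse the init, clone, and push cycle before touching your account, the same steps work against a local directory. This is a sketch under stated assumptions: a temporary directory stands in for the rsync.net filesystem, and notes.txt and the committer identity are made up for the example.

```shell
#!/bin/sh
# Sketch: the init/clone/push cycle, with a local directory standing in
# for the rsync.net account. File names and identity are illustrative.
set -e
work=$(mktemp -d)
remote="$work/git_repo.git"

# Equivalent of running "git init --bare git_repo.git" on the server.
git init --quiet --bare "$remote"

# Cloning an empty repository prints a warning, which we silence here.
git clone --quiet "$remote" "$work/git_repo" 2>/dev/null
cd "$work/git_repo"

# Name the first branch "master" regardless of the local default branch setting.
git symbolic-ref HEAD refs/heads/master

echo "hello" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "First commit"

# The first push names the branch explicitly, as in the post.
git push --quiet origin master

# The bare repository now resolves "master" to the commit we just made.
git --git-dir="$remote" rev-parse --verify --quiet master
```

Against rsync.net, the only difference is that the clone URL is the ssh:// form shown earlier instead of a local path.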
To pull updates from your git repository, in the event that a different machine pushes some changes up to the git repository, use 'git pull':
$ git pull
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ssh://usw-s003.rsync.net/data2/home/1234/git_repo
   f4d263b..445595a  master     -> origin/master
Updating f4d263b..445595a
Fast-forward
 TestableJava.pdf |  Bin 0 -> 86224 bytes
 1 files changed, 0 insertions(+), 0
 create mode 100644 TestableJava.pdf
Finally, we can make changes with branching. For this example, let's assume that we've been backing up a settings directory and we're about to embark on a system upgrade that may drastically change all of our settings files. We want to be able to upgrade the settings directory without losing any of the prior work and while retaining the ability to go back to the older settings, if necessary. To do this we'll need to create a new branch:
$ git branch upgrade
We can then see the branch using "git branch" without any parameters:
$ git branch
* master
  upgrade
The asterisk (*) indicates that we're still using the master branch, so let's switch to the upgrade branch:
$ git checkout upgrade
Switched to branch 'upgrade'
Now, after performing the upgrade and committing our changes, we can push the changes to the server without losing any of our prior work.
$ git push origin upgrade
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.23 KiB, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://[email protected]/data2/home/1234/git_repo.git
 * [new branch]      upgrade -> upgrade
We can now switch between the two different versions using "git checkout" and use "git diff" to see the differences between the revisions. To revert back to our pre-upgrade configuration, we use "git checkout" one more time:
$ git checkout master
Switched to branch 'master'
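The whole branching workflow can be tried locally in a throwaway repository. The sketch below assumes a made-up settings.conf file and committer identity; it commits one version on master, another on upgrade, lists what differs between the two, and then returns to the pre-upgrade state.

```shell
#!/bin/sh
# Sketch: keep pre-upgrade and post-upgrade settings on separate branches.
# settings.conf, its contents, and the identity are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
# Name the first branch "master" regardless of the local default branch setting.
git symbolic-ref HEAD refs/heads/master

# Small helper so the example does not depend on a configured git identity.
commit() {
    git -c user.name="Example" -c user.email="example@example.com" \
        commit --quiet -m "$1"
}

echo "color=blue" > settings.conf
git add settings.conf
commit "Settings before the upgrade"

# Create the upgrade branch, switch to it, and record the changed settings.
git branch upgrade
git checkout --quiet upgrade
echo "color=green" > settings.conf
git add settings.conf
commit "Settings after the upgrade"

# List the files that differ between the two branches.
changed=$(git diff --name-only master upgrade)
echo "$changed"

# Go back to the pre-upgrade settings.
git checkout --quiet master
cat settings.conf
```

Dropping --name-only from the git diff command shows the line-by-line changes instead of just the file names.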
A special note regarding git client functionality:
git has been configured to be a fully functional server hosted on rsync.net. Although certain operations may work, not all programs have been installed to make it a fully functional client.
A further note regarding hook scripts:
There is no hook script support for either subversion or git. However, as we saw in the above subversion example, where svnsync required the pre-revprop-change hook script, you can symlink /bin/echo to the hook script you need, in your own rsync.net filesystem, to satisfy this rare requirement. It simply returns a zero and satisfies svnsync.
As always, please contact [email protected] if you have any questions.
Great news! Thanks for adding a useful feature to a rock-solid, and exceptionally well managed, service.
Posted by: ludo | February 18, 2010 at 04:11 PM
And Mercurial? ;-)
Posted by: uou | February 22, 2010 at 04:56 PM
I use an ssh config file, with which it appears that there is no need to use absolute paths, making the git remote entry simpler.
My ~/.ssh/config includes (server and username changed):
Host rsync.net
Hostname usw-s003.rsync.net
User 12345
Now all I have to do is:
ssh rsync.net git init --bare test.git
git clone rsync.net:test.git
(etc)
Posted by: Rob | February 23, 2010 at 04:08 PM
Really excellent, thank you. Being able to 'git init' on the server is a whole lot easier than locally creating and scp'ing a bare repo before I can clone. GitHub is great for public stuff, but git+rsync.net just became my first choice for stuff I really can't afford to lose.
Posted by: thomasn | February 23, 2010 at 05:26 PM
@uou : No plans for mercurial, or any other DVCS support in the future. Please see the paragraph above:
That being said, we will not be adding any further such systems. There will not be cvs or Mercurial support in the future (although I am aware of some ways to run Mercurial over a git, or subversion "transport" and users are free to do so).
Once we add mercurial there is going to be another DVCS de jour, and so on. I feel like things have solidified a bit with svn and git as the "standard" offerings and we don't want to add any more binaries and libraries to these systems.
I'd appreciate hearing any comments on our interpretation of the landscape ... is hg supplanting svn and git en masse ?
Posted by: John Kozubik | February 23, 2010 at 05:36 PM
The more standard tool to use when you just need something to return "true" is /bin/true. Any reason not to use that instead of /bin/echo?
Regarding hg, I would guess that it is more likely to supplant svn and maybe cvs than git.
Posted by: Wodin | February 23, 2010 at 11:01 PM
Yay, thank-you! :D
Posted by: fukawi2 | February 23, 2010 at 11:48 PM
I would say Mercurial is worth considering. I say this because it has been around just as long as git has (they both started at the same time) and due to its easier migration path, a lot of subversion users are migrating over to Mercurial rather than git.
One should also consider the fact that there are far more cross-platform third party extensions and apps available for Mercurial than there are for git.
I think the above points together with Mercurial's excellent documentation and the fact that Mozilla, Sun and Python all use it as their DVCS would suggest to me at least that it is not a DVCS de jour and will be around for the long haul (likely to the detriment of Subversion).
Posted by: gpenguin | February 24, 2010 at 06:30 AM
@Wodin : You are correct regarding the 'true' command. The explanation is, we need 'echo' in the chroot jail for other rsync.net functions, and do not need 'true', so it's a very small optimization towards keeping the chroot jail as small and simple as possible. In this case, probably superfluous, since 'true' is so simple.
@ gpenguin : Your points are well taken. There is one other issue, though ... if we can implement mercurial in a manner that is similar to how we implemented svn and git, using only components written in C, then we can consider it. But we cannot put a python interpreter, etc., into the chroot jail.
How much / what portions of hg are in python ?
Posted by: John Kozubik | February 24, 2010 at 07:18 AM
Very cool additions to the service, thanks!
Though I respect your desire to keep things simple, I'm another user who'd be happy to see Mercurial support. Q.v. above arguments.
The "DVCS de jour" problem you fear is no longer relevant IMO -- it's very 2007. Sure, there's a chance that after you add hg somebody will pop up and ask for bzr or darcs -- but they really are outliers at this point. And CVS is a legacy system with well-known migration paths. It's all about Git and Mercurial now.
Posted by: Paul Bx | February 24, 2010 at 07:34 AM
@John Kozubik: I hadn't appreciated the python complication... I believe hg is entirely python based which I guess will rule it out unfortunately. A shame because I love it.
Posted by: gpenguin | February 25, 2010 at 01:05 PM
This is great, great news. Thank you for adding git support.
Posted by: Yves Junqueira | April 23, 2010 at 12:28 PM
There's a way to have better and cleaner URLs.
Look (my userid replaced with 1234):
Neat, no? The real repository root is still /data2/home/1234/repos
I created a new ssh key for svn and added following to my ssh config file:
Then I appended the new public key to 10311's authorized keys, but with a special command option prepended:
Now short path can be used to address the repositories and in case you move the directory somewhere that will only require one single change in the authorized_keys file.
Posted by: Kirill Miazine | August 13, 2010 at 01:09 AM