GitMigration

From reSIProcate
Revision as of 08:38, 11 September 2014 by Dpocock (talk | contribs)

Introduction

This page is a central resource for information about the migration of the reSIProcate repository from Subversion to Git.

The migration is anticipated in September 2014.

Subversion uses a client-server architecture. A dedicated server running FreeBSD hosts the reSIProcate Subversion repository as well as other services such as the download pages, mailing lists (https://list.resiprocate.org), SVN browser, Doxygen-generated API documentation and Bugzilla.

In comparison, Git uses a distributed architecture. Every local copy includes a full history of all commits and all published branches. Many people choose to periodically synchronize their local copy of a Git repository with a single server to simulate the client-server model; however, Git does not insist on this paradigm.

Github

The Github site/service offers one possibility for hosting a repository that people sync their local repositories with. Due to the distributed architecture of Git, using Github does not lock us in.

In the initial conversion to Git, we will try using Github but a cron job will sync the repository onto the existing server every 10 minutes.
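
A minimal sketch of what such a sync job could look like is shown below. It is not the script actually deployed: the function name `sync_mirror`, the script path in the crontab line and the destination directory are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical mirror script; names and paths are illustrative assumptions.
# Example crontab entry to run it every 10 minutes (script path is assumed):
#   */10 * * * * /usr/local/bin/sync-mirror.sh

# sync_mirror SOURCE DEST_DIR
# Creates a bare mirror on the first run, then fetches all refs on later runs.
sync_mirror() {
    src=$1
    dest=$2
    if [ -d "$dest" ]; then
        # Update every ref and drop refs that were deleted upstream.
        git -C "$dest" remote update --prune
    else
        git clone --mirror "$src" "$dest"
    fi
}

# Example invocation (destination path is an assumption):
# sync_mirror https://github.com/resiprocate/resiprocate /var/git/resiprocate.git
```

Because a mirror clone copies every branch and tag, the local copy could later serve as the canonical repository if the project stops using Github.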

If and when we can replicate all of the Github features on a local server, we may stop using Github in future.

Github benefits

  • Browsing the code online
    • This is one of the easiest things to replicate on our own server, using cgit
  • Easy for non-committers to post patches for review as pull requests
    • also reduces the effort for the committers to receive and inspect patches as Github automatically builds them
    • the alternative to this is to set up Gerrit and Jenkins and we will still maintain the option to do that in future
  • Fast sync with travis-ci.org
    • travis will quickly detect commits and notify committers if they break the build
    • the alternative to this is to run a Jenkins server
  • community
    • Github produces interesting reports on each committer and their links to other projects
  • reporting
    • the reports can be built on our own server using something like GitStats in future
  • Access control
    • Git itself doesn't implement access control in any way (although it does allow people to PGP sign tags)
    • ACLs are normally implemented in the network, for example, using SSH keys with Git over SSH
    • The Gitosis tool provides a convenient way to manage this if we eventually want to fully host the project repository without Github

Github limitations

  • issue tracker is very basic
    • Our Bugzilla is more powerful and we will keep using it for now
    • therefore, Github issue tracker is administratively disabled
  • wiki is very basic
    • MediaWiki is more powerful
    • Many pages link to the existing wiki
    • therefore, Github wiki is administratively disabled
  • Github's release page generates tarballs of any tag on the fly
    • unfortunately, this means they don't have consistent checksums: two people downloading the same tag are not guaranteed to see the same checksum
    • therefore, we will continue hosting release tarballs ourselves
    • unfortunately, Github does not provide an option to administratively disable/hide their release page
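
One way to produce a self-hosted tarball with a stable checksum is to build it once from the tag and publish the checksum next to it. The sketch below assumes a `resiprocate-TAG` naming scheme and a hypothetical helper called `make_release_tarball`; neither is taken from the project's actual release process.

```shell
# make_release_tarball REPO_DIR TAG OUT_DIR
# Hypothetical helper: builds OUT_DIR/resiprocate-TAG.tar.gz once and records
# its SHA-256 checksum, so every downloader gets the same file.
make_release_tarball() {
    repo=$1
    tag=$2
    out=$3
    name="resiprocate-$tag"   # naming scheme is an assumption
    mkdir -p "$out" &&
    git -C "$repo" archive --format=tar --prefix="$name/" "$tag" > "$out/$name.tar" &&
    # -n omits the original name and timestamp from the gzip header.
    gzip -n -f "$out/$name.tar" &&
    (cd "$out" && sha256sum "$name.tar.gz" > "$name.tar.gz.sha256")
}

# A downloader can then verify the file with:
#   sha256sum -c resiprocate-TAG.tar.gz.sha256
```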

Learning Git

  • For Windows users
    • The wiki page Quick_Windows_Git_Installation explains how to set up the plugin for Visual Studio, and Quick_Windows_Git_Clone_Checkout then explains the next steps.
  • For command line users, please see the excellent Git documentation and online manual

Migrating the history to Git

  • We have been using sync2git to sync the SVN repository into a Git mirror every hour
  • The file svn-authors.txt was created to map the SVN user IDs to the preferred Git identity of each user
    • we have tried to map these to the email addresses that people registered on Github so their commits will have hyperlinks to their Github profiles
  • All branches and tags should be correctly mapped in Git
  • This appears to have worked out well and this mirror repository will become the canonical repository at the moment we decide to go live on Git.
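
For readers unfamiliar with author mapping, the sketch below shows the authors-file format that git-svn accepts and a small check for SVN users that have no mapping yet. It illustrates the general technique with plain git-svn and is not a description of how sync2git works; the helper name `missing_authors` and the `SVN_URL` placeholder are assumptions.

```shell
# Authors file format accepted by git-svn, one line per SVN user:
#   svnuser = Full Name <email@example.com>

# missing_authors AUTHORS_FILE
# Hypothetical helper: reads SVN user IDs on stdin (one per line) and prints
# those that have no entry in AUTHORS_FILE.
missing_authors() {
    authors_file=$1
    while IFS= read -r user; do
        [ -n "$user" ] || continue
        # Compare the text before "=" (whitespace trimmed) with the user ID.
        if ! awk -F'=' -v u="$user" '
                { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k); if (k == u) found = 1 }
                END { exit !found }' "$authors_file"; then
            printf '%s\n' "$user"
        fi
    done
}

# Possible usage (SVN_URL is a placeholder, not the project's real URL):
#   svn log -q SVN_URL | awk -F'|' '/^r[0-9]/ { gsub(/ /, "", $2); print $2 }' \
#       | sort -u | missing_authors svn-authors.txt
#   git svn clone --stdlayout --authors-file=svn-authors.txt SVN_URL resiprocate-git
```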

Access control issues

Existing committers getting access on Github

  • Please contact Daniel Pocock, Scott Godin or Philip Kizer with your Github user ID and we will add you to the access list on Github

Existing scripts that rely on SVN

  • Some people may have scripts for polling the SVN repository
    • For example, the script that runs doxygen to update the API documentation pages
  • Github provides a convenient SVN API that is particularly good for read-only purposes
  • These scripts can simply be modified to read from the Github SVN API
  • The Github project URL is also the SVN URL: Github detects requests from an SVN client and responds appropriately
  • In this SVN view, the *main* branch is called *trunk* (it is called *master* in Git)
  • Example:

$ svn ls https://github.com/resiprocate/resiprocate
branches/
tags/
trunk/

Uncommitted changes in local SVN workspaces

  • If you have uncommitted changes in a local SVN workspace, you will need to manually copy them into a Git repository
    • use svn status to find any new files and svn diff to view changed files
    • make a git clone and check out the relevant branch
    • manually copy the files you identified into the git clone directory and commit them
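
The steps above can be sketched roughly as follows. The two helper functions and the directory paths are hypothetical, and the sketch assumes that the Git branch you check out corresponds to the SVN revision your workspace was based on.

```shell
# changed_paths: filter `svn status` output (stdin) down to the paths of
# unversioned (?), added (A) and modified (M) items.
# Column 1 holds the item status; the path starts at column 9.
changed_paths() {
    grep '^[?AM]' | cut -c9-
}

# copy_changes SVN_DIR GIT_DIR
# Copies each relative path read from stdin from the SVN workspace into the
# Git clone. Directories are skipped: an unversioned directory is listed as
# a single entry by `svn status`, so copy its contents by hand.
copy_changes() {
    svn_dir=$1
    git_dir=$2
    while IFS= read -r path; do
        [ -f "$svn_dir/$path" ] || continue
        mkdir -p "$git_dir/$(dirname "$path")"
        cp "$svn_dir/$path" "$git_dir/$path"
    done
}

# Possible usage (paths are assumptions):
#   git clone https://github.com/resiprocate/resiprocate ~/resip-git
#   (cd ~/resip-git && git checkout master)
#   (cd ~/resip-svn && svn status) | changed_paths | copy_changes ~/resip-svn ~/resip-git
#   (cd ~/resip-git && git status)   # review, then git add and git commit
```

Deleted files (status D) are deliberately ignored here; remove them in the Git clone with git rm if needed.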