Underpants Gnomes Project Plan for Moving to Git
- Migrate the Asterisk Test Suite
- Migrate Asterisk
- ???
- Profit.
Git Hosting Comparison
One of the nice things about moving to Git is that there are a ton of great platforms that make using Git easier/better/prettier. One of the terrible things about moving to Git is that there are a ton of great platforms that make using Git easier/better/prettier, and everyone has an opinion about it.
Below is a comparison of five platforms, subjected to a variety of criteria:
- Web View - can we use the interwebs to view things
- Project Management - how much does the platform provide outside of source control. To some extent, this is not terribly important for the Asterisk project; issues are going to remain in JIRA at the very least.
- Protected Branches - many developers use private branches in Subversion today. Having the ability to prevent pushes is handy.
- Rewriting History - sometimes it's good to forget; sometimes it's not.
- Arbitrary Repository Creation - users may want more than just a branch.
- Git Hooks/Web Hooks - given the amount of infrastructure the project already has, it is certain we'll need to integrate Git with existing tools.
- Performance - this is a big one, given the amount of history the Asterisk project has. A good management tool will need to be able to handle the nearly half a million commits the Asterisk project currently has.
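
To make the Protected Branches and Rewriting History criteria a bit more concrete, here is a minimal sketch of what server-side enforcement could look like on a platform that allows plain git hooks. It relies on git's standard `update` hook (called once per pushed ref with the ref name, old object name, and new object name) and on `git merge-base --is-ancestor`. The branch patterns in `PROTECTED` and the policy itself are assumptions made up for illustration, not something the project has decided on.

```python
#!/usr/bin/env python3
"""Sketch of a server-side git `update` hook (hypothetical policy).

Git runs hooks/update once per pushed ref with three arguments:
the ref name, the old object name, and the new object name.
A non-zero exit status rejects the update to that ref.
"""
import fnmatch
import subprocess
import sys

# Hypothetical policy: refs matching these patterns may only fast-forward.
PROTECTED = ["refs/heads/master", "refs/heads/[0-9]*"]


def is_zero(rev):
    """An all-zero object name means the ref is being created or deleted."""
    return set(rev) == {"0"}


def is_ancestor(old, new):
    """True if `old` is reachable from `new`, i.e. the push is a fast-forward."""
    result = subprocess.run(["git", "merge-base", "--is-ancestor", old, new])
    return result.returncode == 0


def check_update(refname, old, new, ancestor_check=is_ancestor):
    """Return an error message if the push must be rejected, else None."""
    if not any(fnmatch.fnmatchcase(refname, p) for p in PROTECTED):
        return None  # unprotected refs (e.g. team branches) are left alone
    if is_zero(new):
        return "deleting %s is not allowed" % refname
    if is_zero(old):
        return None  # creating a new protected branch is allowed here
    if not ancestor_check(old, new):
        return "non-fast-forward push to %s is not allowed" % refname
    return None


if __name__ == "__main__" and len(sys.argv) == 4:
    error = check_update(*sys.argv[1:4])
    if error:
        sys.stderr.write("rejected: %s\n" % error)
        sys.exit(1)
```

A hook like this only helps where the server lets us install it, which is exactly the distinction the comparison below draws between the platforms.
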
Criteria | Github | Atlassian Stash | Gitlab | Gitolite | Gerrit OpenStack Workflow |
---|---|---|---|---|---|
Web View | Yes | Yes | Yes | Provided by gitweb or CGit (CGit looks like a more attractive solution) | Provided by CGit |
Project Management | Issues, Wiki, Pull Requests, Code Review are all built in. | Issues, Wiki, Pull Requests, Code Review are all built in. | Issues, Wiki, Pull Requests, Code Review are all built in. | None. Although we have alternate solutions for Issues, Wikis, and Code Review, a solution for Pull Requests would be nice. Contributors could work on pull requests using their own Github account and submit their work via patches to Review Board. | Does not encompass Issues. The overall workflow includes code review/project management through Gerrit and CI through Jenkins/Zuul. This would replace Review Board and Bamboo. |
Protected Branches | Permissions are only supported at the repository level. A client hook could be installed to prevent pushing certain branches, but this could not be enforced on the server. | Protected branches are supported, whether or not they have been created yet, but are not enforced when forking or creating a repository that acts as another remote. Custom Stash hooks could be implemented to stop certain branches from being pushed but must be installed manually per repository. | Protected branches are supported but must first be created with at least one file in the given branch. Forking or creating a repository that acts as another remote removes all permissions related to protected branches. The enterprise edition supports git hooks, but they appear to be limited at this time, although the project page mentions that they are prepared to implement new hooks as requested. These hooks would have to be enabled on a per-repository basis. | Protected branches are supported, whether or not they have been created yet. Since permissions are defined using regular expressions, repositories do not have to exist before permissions can be applied. | Gerrit supports permissions on a per-branch basis. |
Rewriting History | No protection against rewriting history. A client side hook could be installed to prevent this but could not be enforced on the server. | Permissions allow protecting against rewriting history. | Permissions allow protecting against rewriting history. | Permissions allow protecting against rewriting history. | Permissions allow protecting against rewriting history, but can be overridden. |
Arbitrary Repository Creation (Team Repositories) | Users can create arbitrary repositories under their own Github account for the purposes of creating pull requests. | A team repository project can be created to allow users to create arbitrary repositories. | Users can create arbitrary repositories under their name, in a similar fashion to Github. | A regular expression permission can be set up to create wildcard repositories to support team repositories. | Users would need the ability to create projects in Gerrit. |
Git Hooks | Installing git hooks is not supported at this time. | Custom Stash hooks can be installed as plugins on a per repository basis. These hooks differ from git hooks in that they must be written using Atlassian's API for hooks using Java. | Only the enterprise edition supports this. | Git hooks are supported plus additional hooks provided by Gitolite to support calling hooks before git is invoked or after a repository is created. | Yes |
Web Hooks | Web hooks are supported and could be used to sync commits with other products/platforms. | Web hooks are not supported but custom Stash hooks could be written to implement syncing with other products/platforms. | Web hooks are supported and could be used to sync commits with other products/platforms. | Web hooks are not supported. Git hooks would have to be created to support syncing with other products/platforms. | Gerrit has an event stream. |
Performance | Github's performance has been shown to be more than adequate. Minimal downtime has occurred in the past. | A year-old issue mentions timeouts when cloning large repositories, which appear to have been fixed. No other information on performance issues could be found. | Recent issues mention timeouts when cloning large repositories. Another issue mentions problems when rendering graphs for repositories with a large number of commits. These issues do not appear to have been fixed yet. | Since Gitolite is a thin wrapper around git, performance is very close to that of git. I could not find any information on performance issues. | Yes |
Process Recommendation | A public repository for contributors to post pull requests and clone/pull from, along with a private repository for pushing commits to. The public repository would in effect be a mirror of selected branches from the private one. This mirror may or may not be done automatically. A limited number of committers would control what commits/branches are pushed to the public/private repository. | A public and a private instance with custom Stash hooks to prevent certain branches from being pushed, which would have to be enabled on a per-repository basis. The public repository would in effect be a mirror of the private repository where pull requests can be created and contributors can clone/pull from. As with Github, a limited number of committers would control what commits/branches are pushed to the public/private repositories. Rewriting history should only be allowed for team repositories and for admins. | A public and a private instance with custom enterprise edition hooks to prevent certain branches from being pushed, which would have to be enabled on a per-repository basis. The public repository would in effect be a mirror of the private repository where pull requests can be created and contributors can clone/pull from. As with Github and Stash, a limited number of committers would control what commits/branches are pushed to the public/private repositories. Rewriting history should only be allowed for team repositories and for admins. | A public and a private instance using permissions to protect certain branches and to allow team repositories. Rewriting history should only be allowed for team repositories and for admins. As with Github, Stash, and Gitlab, a limited number of committers would control what commits/branches are pushed to the public/private repositories. | Use a model similar to the OpenStack model. Anyone can contribute. Once approved, developers with higher permissions can push to a branch. |
Notes
- Given our large number of commits, performance should be one of our most important criteria. Security can be somewhat controlled by limiting the number of committers to selected branches.
- There's a lot of benefit in not tying our issue tracker, test tools, review tools, and everything to a single management platform. Many of the platforms tie you into a full solution, which may be detrimental in the long run.
Initial Recommendations
Platform | Recommendations |
---|---|
Github | While it is by far the most popular platform, it does have some drawbacks. |
Stash | While it would plug into our existing Atlassian tools, it also has some drawbacks. |
Gitlab | Potential performance issues are a big knock against this. |
Gitolite | This is the most minimal of the management platforms, and has the fewest end-user features. At the same time, it also gets in the way the least. (We also already have it set up at git.asterisk.org) |
Gerrit | This would require the most tooling changes. At the same time, it also has some obvious benefits (tight integration between source control, code review, and CI), and has worked well for the OpenStack project. |
One of the nice things about git is that it is very easy to set up a mirror on Github for those who want it, but use something else for the "daily" development activities. For now, the initial recommendation is to go with Gitolite, as it has a lot of the backend features that we'd like, is very performant, and doesn't require a re-evaluation of every tool the Asterisk project uses.
For now, we are leaning towards:
- Gitolite
- Gerrit
Mirrors can be set up on Github if desired (similar to the DAHDI project currently).
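
As a rough illustration of how little is involved in such a mirror, the sketch below builds the git invocations for a one-way sync: a `git clone --mirror` (or `git remote update --prune` on later runs) from the primary repository, followed by `git push --mirror` to the mirror. The URLs and the working directory are placeholders, and the helper names are invented for this example; how the sync gets triggered (cron, a post-receive hook, a CI job) is left open.

```python
import subprocess


def mirror_commands(source_url, mirror_url, workdir, already_cloned):
    """Return the git invocations that refresh a one-way mirror."""
    if already_cloned:
        # Later runs: fetch all refs and drop those deleted upstream.
        refresh = ["git", "-C", workdir, "remote", "update", "--prune"]
    else:
        # First run: bare clone that copies every ref (branches, tags, notes).
        refresh = ["git", "clone", "--mirror", source_url, workdir]
    # Push every ref, forcing updates and deleting refs that no longer exist.
    push = ["git", "-C", workdir, "push", "--mirror", mirror_url]
    return [refresh, push]


def run_mirror(source_url, mirror_url, workdir, already_cloned):
    """Execute the sync, stopping at the first git command that fails."""
    for command in mirror_commands(source_url, mirror_url, workdir, already_cloned):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder URLs and path; substitute the real repositories.
    for command in mirror_commands(
        "ssh://git@git.example.org/asterisk.git",
        "git@github.com:example/asterisk.git",
        "/var/cache/mirror/asterisk.git",
        already_cloned=False,
    ):
        print(" ".join(command))
```

Note that `--mirror` overwrites whatever is on the mirror side, so the Github copy should be treated as read-only.
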
Moving the Asterisk Test Suite to Git
This is a nice place to start, as it lets us flesh out a lot of the tooling without worrying as much about branches and tags. It's also much smaller.
- Upgrade the existing instance of gitolite (which probably means "purge")
- Determine how we want to manage contributors/authors. Currently, we use a commit message template to note that someone other than the committer wrote the patch; if at all possible, we'd like to incorporate that into the process.
- This means a much larger authors file (if possible)
- It also means no anonymous authors, as we need the e-mail address. That's probably a good thing in the long run.
- Figure out how we want to manage commit access
- SSL certificates
- SSH keys
- Create the repo from SVN. Turn off SVN commit access.
- Migrate ReviewBoard to hit the new Git repo.
- Update the Bamboo scripts/agents to hit the new repo.
- Have cake.
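
For the authors step above, one possible starting point is the authors file that `git svn` accepts via `--authors-file`, where each line maps an SVN login to `Full Name <email>`. The sketch below renders such a file from a mapping and refuses entries without an e-mail address, matching the "no anonymous authors" point. The function name and the sample data are made up for illustration; the real mapping would have to come from the project's committer records.

```python
def authors_file(committers):
    """Render {svn_login: (full_name, email)} in git-svn authors-file format."""
    lines = []
    for login in sorted(committers):
        name, email = committers[login]
        if not email:
            # Git needs an address for every identity, so fail loudly.
            raise ValueError("no e-mail address for %r" % login)
        lines.append("%s = %s <%s>" % (login, name, email))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, not real committers.
    sample = {
        "jdoe": ("Jane Doe", "jdoe@example.org"),
        "asmith": ("Alex Smith", "asmith@example.org"),
    }
    print(authors_file(sample), end="")
```

The resulting text can be saved and passed to `git svn clone --authors-file=<file>`. This only covers committers; attribution for patch authors who never had SVN access would still need to be recovered from the commit messages.
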
10 Comments
Andrew Latham
In a review of tool chains for my current employer, I found that Github did not handle authorization the way I expected. This is likely not of importance here, but I am sharing it to expand the discussion: https://enterprise.github.com/help/articles/about-ldap-authentication#account-synchronization
Ben Langfeld
There are a few more Github clones that might be considered:
http://gogs.io/
https://gitorious.org/
Andrew Latham
I am also tracking https://kallithea-scm.org/ but it is very very new. Reviewing the feature set for comparison is always helpful.
Ben Langfeld
A commit message template to note author vs committer is not necessary - this functionality is built in to git and identities are held in commit headers.
Andrew Latham
Can you provide an example workflow that will track many authors of a patch being committed by a separate developer?
Ben Langfeld
A git commit can only be authored by a single identity. Surely if a patch consists of separate contributions by separate individuals it should be made up of multiple commits?
Matt Jordan
That depends. Often we have patches provided on the issue tracker that were provided by developers that get committed in their original or slightly modified form. Sometimes those developers can be reached - sometimes, they are MIA (different job, different address, etc.)
Today, when we note the original authorship of the patch, various tools in the project pull that information out and provide attribution to that author. It might not make a difference in Subversion, but release notes (as one example) do make use of this field.
While having authors provide the patches in git directly may be sufficient for future contributions, that doesn't deal with the patches currently on the issue tracker - and I'm sure there will be instances where all we have is a diff and an e-mail address.
Ben Langfeld
Also, last point: why is it considered necessary to have private and public versions of a repo? This is not clear in this document and I cannot think of any reason for it.
Andrew Latham
Private repos can be very helpful for security patches, testing, and even Digium-specific code and work. Every system should provide private or even wild repos for those use cases.
Matt Jordan
Pretty much this. Every developer today who has commit access has private repos created for them so that they can contribute to security issues or other sensitive work if they so desire.