Sub Repositories
Git and Subversion repositories work well with a large number of files: they keep track of every change made to them, and those changes can be reverted.
Usually, we start with one repository and keep adding files and folders to it. In the beginning those files and folders are related but, at some point, they tend to become unrelated.
As projects grow, they are usually split into modules. Even though the modules relate to each other, they tend to be developed by different people, at different paces, and with different management workflows. This can be a serious management challenge.
So, the usual Git solution is to move each module into its own external Git repository and to create an umbrella Git repository. The umbrella repository holds a reference to each external Git repository that contains a module.
Advantages:
- Separate git management workflows
- Module development focus is not lost (only module commits are shown)
- More control on who can commit to where
- Umbrella project enables code compilation using all modules
Disadvantages:
- Can lead to duplication of code
- Code dependencies between modules need to be checked
- May require changing the code compilation path(s)
How can this be done?
Repo tool
The Repo tool [1] [2] [3] [4] was developed for the Android Open Source Project. It is a Git wrapper that eases the management of several Git repositories. It has several advantages over git submodules; however, it is another tool that has to be added to the system (locally or globally).
It is fairly simple to use and manage. Instead of an umbrella repository, it only needs a Git repository with a default.xml file. This XML file describes where the remote Git servers are and how their repositories are managed on the local system.
Example of default.xml
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <remote name="opensourceprojects" fetch="http://opensourceprojects.eu" />
  <default remote="opensourceprojects" sync-c="true" sync-j="4" />
  <project path="local/users-extractor" name="git/p/timbus/context-population/extractors/local/users-extractor-perl" revision="master"/>
  <project path="local/debian-sw" name="git/p/timbus/context-population/extractors/local/debian-sw" revision="master"/>
  <project path="local/network-info-perl" name="git/p/timbus/context-population/extractors/local/network-info-perl" revision="master"/>
  <project path="local/perl-modules" name="git/p/timbus/context-population/extractors/local/perl-modules" revision="master"/>
</manifest>
Start working with Repo
If we don't have curl installed we need to install it with:
$ sudo apt-get install curl
Then prepare the environment and download repo from Google Repositories
$ mkdir ~/bin
$ echo "PATH=~/bin:\$PATH; export PATH" >> ~/.bashrc
$ . ~/.bashrc
$ curl http://commondatastorage.googleapis.com/git-repo-downloads/repo > ~/bin/repo
$ chmod a+x ~/bin/repo
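A quick way to confirm that the environment is ready is to check that ~/bin is on PATH and that the repo launcher is executable. The path_contains helper below is a hypothetical convenience function, not something Repo provides; treat it as a sketch only.

```shell
# Hypothetical helper: succeeds when directory $1 is one of the PATH entries.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is not on PATH yet; open a new shell or source ~/.bashrc"
fi

if [ -x "$HOME/bin/repo" ]; then
    echo "repo launcher is executable"
else
    echo "repo launcher is missing or not executable"
fi
```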
If you prefer, you can download this script to ease the environment installation. Bear in mind that the curl package must already be installed.
After the download, you will need to change its permissions to 700:
chmod 700 multi_repository_preparation.sh
And finally, clone the repository that holds default.xml and sync all code repositories:
$ mkdir umbrella_repository
$ cd umbrella_repository
$ SSL_NO_VERIFY=true repo init -u https://opensourceprojects.eu/git/p/timbus/context-population/extractors/local/local-extractors
$ repo sync
The repo metadata is stored in a folder with the name ".repo" in the same folder where you use the "repo init" command. If for some reason there is an error, you can remove that folder.
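As a sketch of that recovery step, the hypothetical helper below (not part of Repo) removes the .repo folder of a client directory. The demo runs against a throwaway directory with a fake .repo folder, so nothing real is touched.

```shell
# Hypothetical helper: remove Repo's metadata from a client directory.
# $1 is the folder where "repo init" was run.
reset_repo_client() {
    if [ -d "$1/.repo" ]; then
        rm -rf "$1/.repo"
        echo "removed $1/.repo"
    else
        echo "no .repo folder in $1"
    fi
}

# Demo on a throwaway directory that only imitates a Repo client.
client=$(mktemp -d)
mkdir "$client/.repo"
reset_repo_client "$client"   # first call removes the folder
reset_repo_client "$client"   # second call finds nothing to remove
```

After that, repo init and repo sync can be run again in the same folder.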
Git submodules
Git's own answer to this problem is the following:
- Create all external git repositories
- Create a new repository where you can add the external repositories, each in a separate folder
- Git keeps track (with some user help) of external git repositories
Some code:
mkdir umbrella-git; cd umbrella-git
git init
Now we have an empty umbrella git. Let's start to add the external git repositories.
git submodule add https://url1 git1
git submodule add https://url2 git2
git submodule add https://url3 git3
git submodule add https://url4 git4
git submodule add https://url5 git5
url1 ... url5 and git1 ... git5 are just examples of URLs and local folders.
This creates the git1 to git5 folders and a .gitmodules file. The .gitmodules file holds the URL and location of each external git repository; this is how git knows that specific folders are submodules and must be treated differently.
Now it's time to commit the .gitmodules file and the submodule folders (git1 to git5).
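To see what ends up in .gitmodules, the steps above can be tried against throwaway local repositories. This is only a sketch: the ext1 and ext2 repositories, the temporary directory, and the demo identity are made up for illustration and stand in for url1 and url2. The protocol.file.allow setting is an assumption of this demo and is needed only because local paths are used as URLs.

```shell
set -e
# Throwaway work area; all names below are hypothetical stand-ins.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two small "external" repositories that play the role of url1 and url2.
for n in 1 2; do
    git init -q "$work/ext$n"
    echo "module $n" > "$work/ext$n/README"
    git -C "$work/ext$n" add README
    git -C "$work/ext$n" commit -q -m "Initial commit of ext$n"
done

# The umbrella repository, as in the text.
git init -q "$work/umbrella-git"
cd "$work/umbrella-git"

# Recent git versions refuse local-path submodule URLs by default, so the
# demo allows them for these two commands only.
git -c protocol.file.allow=always submodule --quiet add "$work/ext1" git1
git -c protocol.file.allow=always submodule --quiet add "$work/ext2" git2

# Show the generated file, then read single values from it.
cat .gitmodules
git config -f .gitmodules --get submodule.git1.path
git config -f .gitmodules --get submodule.git2.url
```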
We can do this with:
git commit -a -s -m "Added all external submodules"
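After the commit, each submodule folder is stored in the umbrella repository as a special entry that records one commit of the external repository. The sketch below makes this visible; the local ext1 repository, the temporary directory, and the demo identity are hypothetical, and protocol.file.allow is set only because a local path is used instead of a real URL.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical external repository standing in for url1.
git init -q "$work/ext1"
echo "module 1" > "$work/ext1/README"
git -C "$work/ext1" add README
git -C "$work/ext1" commit -q -m "Initial commit of ext1"

# Umbrella repository with one submodule.
git init -q "$work/umbrella-git"
cd "$work/umbrella-git"
git -c protocol.file.allow=always submodule --quiet add "$work/ext1" git1

git commit -q -a -s -m "Added all external submodules"

# .gitmodules is a normal file (blob); git1 is listed with mode 160000 and
# type "commit", i.e. a pointer to one commit of the external repository.
git ls-tree HEAD

# The same commit id, as reported by the submodule tooling.
git submodule status
```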
Cloning the umbrella git repository
To clone this kind of repository, it is necessary to explicitly tell git to check out all submodules recursively, with the command:
git clone --recursive http://url/umbrella-git.git umbrella-git
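If the umbrella repository was already cloned without --recursive, the submodule folders exist but are empty; git submodule update --init --recursive fills them in afterwards. The sketch below compares both routes. The local ext1 and umbrella-git repositories, the clone folder names, and the demo identity are hypothetical, and protocol.file.allow is set only because local paths are used instead of real URLs.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical external repository and an umbrella that references it.
git init -q "$work/ext1"
echo "module 1" > "$work/ext1/README"
git -C "$work/ext1" add README
git -C "$work/ext1" commit -q -m "Initial commit of ext1"

git init -q "$work/umbrella-git"
git -C "$work/umbrella-git" -c protocol.file.allow=always \
    submodule --quiet add "$work/ext1" git1
git -C "$work/umbrella-git" commit -q -a -m "Added external submodule"

# 1) Recursive clone: the submodule is checked out right away.
git -c protocol.file.allow=always clone -q --recursive \
    "$work/umbrella-git" "$work/clone-recursive"
ls "$work/clone-recursive/git1"

# 2) Plain clone: the git1 folder exists but is empty ...
git clone -q "$work/umbrella-git" "$work/clone-plain"
ls -A "$work/clone-plain/git1"

# ... until the submodules are initialised and updated explicitly.
git -C "$work/clone-plain" -c protocol.file.allow=always \
    submodule --quiet update --init --recursive
ls "$work/clone-plain/git1"
```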
Working with submodules
Git has some added features for working with submodules. Through the foreach subcommand, it is possible to execute a command in each submodule folder.
git submodule foreach command
If you need to keep track of new commits in all submodules, you can do the following:
git submodule foreach git remote update
Or, if you want to pull all changes from a remote repository into the master branch:
git submodule foreach git pull origin master
Or, if you want to tag all submodules with a specific tag:
git submodule foreach 'git tag -a version-1.0 -m "Code version 1"'
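The foreach subcommand can be tried out safely on throwaway repositories. In the sketch below, the local ext1 and ext2 repositories, the temporary directory, and the demo identity are hypothetical; protocol.file.allow is set only because local paths replace real URLs. Passing the whole command as one single-quoted string is a choice made here so that the variables and the tag message are interpreted inside each submodule rather than by the calling shell.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two hypothetical external repositories.
for n in 1 2; do
    git init -q "$work/ext$n"
    echo "module $n" > "$work/ext$n/README"
    git -C "$work/ext$n" add README
    git -C "$work/ext$n" commit -q -m "Initial commit of ext$n"
done

# Umbrella repository with both of them as submodules.
git init -q "$work/umbrella-git"
cd "$work/umbrella-git"
git -c protocol.file.allow=always submodule --quiet add "$work/ext1" git1
git -c protocol.file.allow=always submodule --quiet add "$work/ext2" git2
git commit -q -a -m "Added all external submodules"

# Print the name and current commit of every submodule. Single quotes keep
# $name and $(...) from being expanded by the outer shell.
git submodule --quiet foreach 'echo "$name: $(git rev-parse --short HEAD)"'

# Tag every submodule; quoting the whole command keeps the message together.
git submodule --quiet foreach 'git tag -a version-1.0 -m "Code version 1"'

# List the tags that now exist in each submodule.
git submodule --quiet foreach 'echo "$name: $(git tag -l)"'
```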
Drawbacks
Even though git submodules seem like a great feature, they lack [5] [6] some tools to track the dynamic behaviour of submodules.
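One concrete side of this drawback is that the umbrella repository records one fixed commit per submodule, so new commits in an external repository are not followed automatically. The sketch below illustrates this under stated assumptions: the local ext1 repository, the temporary directory, and the demo identity are hypothetical, the external branch is forced to be called master to match the commands in this document, and protocol.file.allow is set only because a local path replaces a real URL.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical external repository whose branch is named master.
git init -q "$work/ext1"
git -C "$work/ext1" symbolic-ref HEAD refs/heads/master
echo "module 1" > "$work/ext1/README"
git -C "$work/ext1" add README
git -C "$work/ext1" commit -q -m "Initial commit of ext1"

# Umbrella repository that references it.
git init -q "$work/umbrella-git"
cd "$work/umbrella-git"
git -c protocol.file.allow=always submodule --quiet add "$work/ext1" git1
git commit -q -a -m "Added external submodule"

# New work lands in the external repository.
echo "more work" >> "$work/ext1/README"
git -C "$work/ext1" commit -q -a -m "Second commit of ext1"

# The umbrella still references the first commit of ext1: the two ids differ.
git rev-parse HEAD:git1
git -C "$work/ext1" rev-parse HEAD

# Following upstream takes explicit steps: pull inside the submodule ...
git -C git1 pull -q origin master
# ... then record the new commit in the umbrella repository.
git status --short
git commit -q -a -m "Update git1 to its latest commit"
```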
Git Subtree
Git subtrees [7]
References
[1]: http://source.android.com/source/developing.html
[2]: http://source.android.com/source/using-repo.html
[3]: http://source.android.com/source/downloading.html
[4]: http://google-opensource.blogspot.pt/2008/11/gerrit-and-repo-android-source.html
[5]: http://blogs.atlassian.com/2013/03/git-submodules-workflows-tips/
[6]: http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/
[7]: https://raw2.github.com/git/git/master/contrib/subtree/git-subtree.txt