One Big Repo

tchomby tchomby at googlemail.com
Fri Feb 27 11:32:49 CET 2009


It's generally said that when using git it's best to break things up into many 
small repos, e.g. one per project, module, etc., rather than dumping it all 
into one big repo. The same advice has been given about using git for your 
homedir. I can see why this is a good idea for source code, but for something 
like versioning your homedir I don't see it. I've been using the multiple 
repositories approach for a while now, but I think I'd be better off if I went 
to One Big Repo for most stuff.

Using one big git repo for your homedir has many advantages over using many 
small repos:

*   It's simpler.

*   You only have to create the repo once. Creating new repos is a PITA. After 
a simple git init; git add .; git commit; you have to make a bare clone of the 
repo, scp that to your central server, update the original repo to track the 
central clone, _and_ clone the repo onto your other machines and add it to 
your mrconfig file... It's complicated enough that things are likely to go 
wrong.

*   You are less likely to lose files. With many small repos, it becomes almost 
as easy to lose an entire repo as it was to lose a file before you started 
versioning your homedir. It sort of defeats the point. With one big repo I just 
commit a new file to my repo and forget about it, and I know I'll never lose 
that file. The point is to avoid having to think about it. With many repos I 
have to consider which repo a new file should belong to, and even whether I 
should create a whole new repo for it.

*   With many repos you have to keep track of them all somehow, so you need a 
tool like mr. That's one more tool to learn, and it means you need to manage a 
mrconfig file.

*   With one big repo, git log gives you a global history of all your files, a 
sort of log of what you've been doing on a day-to-day basis. This can be really 
handy. For example, I have to meet with my supervisors every few weeks. Instead 
of relying on my memory I can just use git log to help me construct a progress 
report.

*   You can still easily get a log or diff for a single project by restricting 
it to a single file or directory.
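
To make the points above concrete, here is a rough sketch of the one-big-repo 
workflow: create the repo once, publish it once, then use git log both globally 
and per directory. It is only an illustration under stated assumptions. A 
throwaway directory stands in for the homedir, a local path stands in for the 
central server (so there is no scp step), and the thesis/ and dotfiles/ 
directories, file names and commit messages are all made up.

```shell
#!/bin/sh
# Sketch only: every path and name below is a hypothetical stand-in.
set -e
work=$(mktemp -d)
home="$work/home"            # stands in for your real homedir
central="$work/central.git"  # stands in for the repo on your central server

mkdir -p "$home/thesis" "$home/dotfiles"
cd "$home"

# One-time setup: git init; git add .; git commit
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
echo "chapter one" > thesis/ch1.txt
echo "alias ll='ls -l'" > dotfiles/aliases
git add .
git commit -q -m "Initial import of homedir"

# Publish once: make a bare clone and have the original repo track it.
git clone -q --bare "$home" "$central"
git remote add origin "$central"

# Day to day: commit a new file and forget about it.
echo "chapter two" > thesis/ch2.txt
git add thesis/ch2.txt
git commit -q -m "thesis: draft chapter two"
git push -q origin HEAD

# Global history of everything in the repo.
git log --oneline

# Log and diff limited to a single "project" (here, a directory).
git log --oneline -- dotfiles
git diff HEAD~1 HEAD -- thesis
```

With many small repos, the "publish once" part would have to be repeated for 
every new repo, which is exactly the overhead complained about above.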

All in all I don't understand why many small repos is the recommended approach. 
It sounds like making something simple into something complex. What 
disadvantages does one big repo have? All I can think of is that the repo will 
get bigger and bigger over time. But if it ever really did get too big, you 
could make an archive backup of the whole thing, then delete the repo and start 
afresh with only your current set of files. I doubt you'd have to do this very 
often at all.
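
The "archive it and start afresh" idea could look roughly like the sketch 
below. Again this is only an illustration: it runs in a throwaway directory 
rather than a real homedir, and the archive name and file names are made up. 
On a real homedir you would want to check the archive before deleting 
anything.

```shell
#!/bin/sh
# Sketch only: every path and name below is a hypothetical stand-in.
set -e
work=$(mktemp -d)
home="$work/home"                       # stands in for your real homedir
backup="$work/homedir-history.tar.gz"   # hypothetical archive location

# Build a small repo with some history to stand in for the big one.
mkdir -p "$home"
cd "$home"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
echo "v1" > notes.txt
git add .
git commit -q -m "first"
echo "v2" > notes.txt
git commit -q -am "second"

# 1. Make an archive backup of the whole thing, history included.
tar -czf "$backup" -C "$work" home

# 2. Delete the repo itself; the working files stay where they are.
rm -rf "$home/.git"

# 3. Start afresh with only the current set of files.
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"
git add .
git commit -q -m "Fresh start from current files"

# The new repo has a single commit; the old history lives in the archive.
git log --oneline
```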

