One Big Repo

Joey Hess joey at kitenet.net
Fri Feb 27 19:55:16 CET 2009


tchomby wrote:
> *   You are less likely to lose files. With many small repos, it becomes almost 
> as easy to lose an entire repo as it was to lose a file before you started 
> versioning your homedir.

I have worried about this too. If you're making new small repos on a
daily basis, it would be easy to forget to push one off your laptop,
and then lose it in one of the disasters that laptops seem to make so
common.

Also, old repos that you no longer use, and have even stopped
checking out, end up one server failure and one backup oops away from
being lost forever.

> *   With one big repo git log gives you a global history of all your files, a 
> sort of log of what you've been doing on a day-to-day basis. This can be really 
> handy. For example I have to meet with my supervisors every few weeks. Instead 
> of using my memory I can just use git log to help me construct a progress 
> report.

Yeah, I sometimes wish I could make mr construct an interleaved log of
all the repos it runs on.
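
As far as I know mr has nothing built in for this, but the idea can be
sketched outside of it. What follows is only a rough illustration, not
a feature of mr: the function names (parse_log, repo_log, interleave)
are made up for the example, and it assumes each repo is a git repo
whose path you already know. It asks git for the commit time and
subject of each commit, then merges the per-repo lists newest first.

```python
import heapq
import subprocess
from typing import Iterable, Iterator, List, Tuple

# One log entry: (commit time as a unix timestamp, repo path, subject line)
Entry = Tuple[int, str, str]


def parse_log(repo: str, raw: str) -> List[Entry]:
    """Parse lines of the form '<unix timestamp><TAB><subject>'."""
    entries = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        stamp, _, subject = line.partition("\t")
        entries.append((int(stamp), repo, subject))
    return entries


def repo_log(repo: str) -> List[Entry]:
    """Run git log in one repo (hypothetical helper, needs git installed)."""
    raw = subprocess.run(
        ["git", "-C", repo, "log", "--pretty=format:%ct%x09%s"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_log(repo, raw)


def interleave(logs: Iterable[List[Entry]]) -> Iterator[Entry]:
    """Merge per-repo logs into one stream, newest commit first."""
    # git log output is not guaranteed to be ordered by commit time,
    # so sort each repo's entries before merging.
    sorted_logs = [sorted(log, reverse=True) for log in logs]
    return heapq.merge(*sorted_logs, reverse=True)
```

To use it, you would call repo_log on each repo path and pass the
resulting lists to interleave. Where the list of paths comes from is
left open here.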

> All in all I don't understand why many small repos is the recommended approach, 
> sounds like making something simple into something complex. What disadvantages 
> does one big repo have?

I think that most of the disadvantages of using one big repo can be
ignored until you have to share (part of) that repo with others.
Note that wanting to check things out onto multiple machines will
eventually lead to the same set of problems that sharing the repo
with others presents.

So, some of the specific problems include:

* Participating in typical free software development, which really
  demands one repo per project. Or working for an employer, who probably
  doesn't want their files in your personal repo.
* Needing to keep some set of files private (not letting others see
  them), and some other set *very* private (only on one or two machines).
* Wanting to check large data files into a repo, but not having space
  to put that repo on some machines.
* Having automated commits to some files (of archived mail, for example),
  and not wanting to see that in your general history, or deal with
  the merging/up-to-dateness issues it can entail.
* Wanting to host some files on one server (perhaps one that is
  well-connected to the world), and others on another (perhaps one
  at home, or at work).

I use a mixed approach:

* I have separate repos for files of well-defined types, like mail,
  sound files, personal docs, personal programs, and my web site.
  Basically, one for each top-level directory of my home directory.
* I have separate repos for each free software (or work) project I am
  involved with, and if I start a new project, I start a new repo for it.
  For me, this means only a few new repos each year, hopefully.
* I have an (over?)complicated set of several repos for my dotfiles, so
  that I can have one repo with a minimal set that doesn't take much
  space, another that adds in the larger stuff, and another that adds
  private dotfiles.

Occasionally, something will start out in one place and have to move to
another (e.g., mr started out in my personal programs and moved to a
standalone package). But most of the time, there's one obvious place to
put any given file, with an existing repo that replicates it in a way
that's appropriate for that type of file.

-- 
see shy jo