Best practice for Documents directory: looking for comments on my current setup

Chanoch (Ken) Bloom kbloom at gmail.com
Sun Apr 19 22:34:42 CEST 2009


On Sun, 2009-04-19 at 16:40 +0200, W. Kaplan wrote:
> Hi all,
> 
> I just recently started giving version control systems another go. I'm a
> humanities grad student and not a programmer, so I assume that my needs
> are a little different from those for which these tools were written.
> However, the same applies for managing your whole home directory, so I
> think this list is a good place to ask for opinions.

Keep in mind that if you have merge conflicts in your office files,
whether OpenOffice or MS Word, git and other version control systems are
much less likely to be able to do something sensible about merging them.
The reason we programmers can make it work is that we use text files
for everything. (Our source code is text files, our configuration files
are text files, our scholarly papers are written in LaTeX...).

I don't expect any other synchronization system can reconcile these
files any better though. When someone comes up with a good way to
reconcile these files, I hope we'll steal it and integrate it into git.

> I tried using subversion several times over the past few years but
> never got the hang of it. The reason why I was motivated again is that I
> have a second computer now with which I want to synchronize my files.
> 
> 
> Initially I simply set up one big git repository in $HOME. I didn't
> really care about having a history because I still have my tried
> rsnapshot setup in place. After being surprised by how well this worked
> I read a little more about the subject (including this mailing list).
> 
> That made me try to split things up into smaller repositories as is
> generally recommended. While I kind of see the point in doing so, I'm
> having trouble moving away from big repositories altogether. Let me
> just outline what I'm doing so far and why:
> 
> 1) a git repository in $HOME for selected dotfiles and the Desktop:
> 
> 	I'm not yet concerned with versioning /all/ my dotfiles and
> synchronizing specific ones to specific computers like some of you do. I
> only add the ones I want to have on both computers.

> 2) a single big repository in ~/Documents:
> 
> 	This is me trying to find a balance between a thousand and few
> repositories. This repository's main purpose is also synchronizing and
> the idea is not to rely on it for versioning. I like to think of it as
> an archive instead, because I'm not going to "work" with it (e.g. create
> branches etc.). However, in rare cases it might come in handy to have
> older versions of files.
> 
> 3) several small, specific projects inside ~/Documents
> 
> 	I thought I would do things right for specific projects (e.g. a thesis
> paper) and create repositories for each one. I put those "active"
> projects into ~/Documents/git-projects/, that I ignore in the big
> Documents repository. Once a project is finished I could put the .git
> directory into a tar archive and move the whole directory to another,
> more buried location in ~/Documents. The finished/final versions of the
> files would then be added to the big ~/Documents repository including an
> archive of the .git directory for whatever future purpose. Henceforth I
> wouldn't have to pull that repository on the 2nd computer anymore since
> the files' final versions are in the Documents archive repository.

> The problems I'm aware of are:
> 
> a) Having the big ~/Documents repository is as wrong as a single big
> $HOME repository, because it's still a lot of unrelated stuff in one
> place. I'm actually not perfectly sure why this is so bad. From what I
> understand it would be hard to work with just a subdirectory as if it
> was its own repository. I think, however, that I should be fine if I
> don't want to do that, but only use it as an archive (see above). In the
> rare case when I have to retrieve a file from it I would probably be
> okay with a messy commit log. Am I missing other possible problems here?

Git is designed to work best for case 3, but practically everyone in the
vcs-home community has some section of their home directory which works
like case 2. I split mine up into several sections though, for example:
      * Grad school applications. Obviously that part of my life is long
        past.
      * ASUCD (Associated Student body of University of California at
        Davis). Another part of my life that's long past.
      * Undergraduate research. Also long past.
      * Project Foobar. So secret I don't even want it on my laptop, and
        I'd rather lose all extant copies of it than have backups where
        someone else might get their hands on it.
      * Finances. This is somewhat less secret than project Foobar.
      * classes/*. A whole collection of repositories, each one for a
        single course I've taken since my first year as an
        undergraduate. (One even contains files that I've been dragging
        around since my senior year of high school.) No need to check
        all of these out or have their history cloned all over. These
        do, of course, get backed up, and once in a while one of them is
        useful to consult for some class I'm TA'ing or when a class
        project I did impacts on my thesis research in some way.
      * Rare Papers. Research papers I've read that I can't just
        download from the web again when I need them. If I'd used git
        when I started (rather than subversion), then I'd probably have
        a full "papers" repository containing everything I've read, and
        it would be a huge throwaway repository.

In all I have 87 repositories. Only a few of those are checked out at
any given time.

For my dotfiles, I have
      * bin, home-common, editors: stuff that seems to be useful on any
        machine I ssh into more than once.
      * mail-configs: useful only on machines I use regularly
      * desktop-environments: useful only on machines that I have
        physical access to.
      * hide: useful only on one specific machine -- contains my GPG key
        among other things.

I don't back things up to other remote machines yet, but I'm
actually planning to start doing that tomorrow, when I'll sneakernet
most of my repositories over to my research lab at school (saving myself
the good deal of time and bandwidth it would take to clone them over the
network), and then use git pull/push thereafter so that only incremental
changes need to be transferred over the network.
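
A rough sketch of that sneakernet-then-incremental workflow, under
stated assumptions: temporary directories stand in for the home machine,
the removable drive, and the lab machine, and the repository name and
file are made up for the demonstration. In real use the "lab" remote
would be an ssh URL rather than a local path.

```shell
#!/bin/sh
# Sketch only: all paths and names are placeholders for this demo.
set -eu
demo=$(mktemp -d)
work="$demo/home/thesis"        # working repository on the home machine
usb="$demo/usb/thesis.git"      # bare copy on the removable drive
lab="$demo/lab/thesis.git"      # bare copy on the lab machine

# Throwaway repository with one commit, so there is something to carry.
mkdir -p "$demo/home" "$demo/usb" "$demo/lab"
git init -q "$work"
echo "first draft" > "$work/notes.txt"
git -C "$work" add notes.txt
git -C "$work" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "first draft"

# 1. At home: write a bare copy of the full history onto the drive.
git clone -q --bare "$work" "$usb"

# 2. At the lab: clone from the drive, so no history crosses the network.
git clone -q --bare "$usb" "$lab"

# 3. Back home: point a remote at the lab copy and push later work.
#    Only objects the lab copy lacks are sent.
git -C "$work" remote add lab "$lab"
echo "second draft" >> "$work/notes.txt"
git -C "$work" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -a -m "second draft"
git -C "$work" push -q lab HEAD

# Show what the lab copy now contains.
git -C "$lab" log --oneline
```

The drive carries the bulk of the history once; after that, each push
or pull only moves the commits made since the last one.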

> b) Since the ~/Documents/git-projects directory is being ignored by the
> Documents repository I have to manually create it on the other computer
> and manually pull the projects inside it. Right now I only have to deal
> with 2 active projects, so I don't mind managing this by hand, but what
> if I create more projects? Maybe that will be the time to try the "mr" tool.

This is what the mr tool is for.
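
As an illustration only, a minimal mr configuration for two active
projects might look like the following. The project names and the
ssh://example.invalid URLs are invented; mr normally reads ~/.mrconfig,
but the sketch writes to a scratch file so nothing real is touched.

```shell
#!/bin/sh
# Sketch only: writes a sample mr configuration to a scratch file.
set -eu
mrconfig=$(mktemp)
cat > "$mrconfig" <<'EOF'
# Section names are repository paths, relative to the directory that
# holds this file. The checkout command runs in the parent directory
# and must create the repository's own directory.
[Documents/git-projects/thesis]
checkout = git clone 'ssh://example.invalid/srv/git/thesis.git' 'thesis'

[Documents/git-projects/seminar-paper]
checkout = git clone 'ssh://example.invalid/srv/git/seminar-paper.git' 'seminar-paper'
EOF
cat "$mrconfig"
```

With a file like this in place as ~/.mrconfig on the second computer,
"mr checkout" clones whatever is missing and "mr update" pulls each
listed repository. Finished projects can simply be left out of the
file, so they are not pulled again.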

> I read about but haven't actually tried the "mr" tool yet. I don't think
> that I want to go down that road. As far as I understand it, I could
> create repositories in different places anywhere in Documents/ and would
> not have to remember them all, because "mr", once configured, would take
> care of it. While this seems nice it also seems inefficient to keep
> pulling repositories that will not see any new commits ever though.

I keep all my repositories in /home/bloom_git, then clone things into my
home directory in /home/bloom. For me, any repository that's old and
rarely used will only live in /home/bloom_git, and not actually get
checked out.
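
Roughly, that layout can be sketched as below. This is a guess at the
mechanics, not my exact setup: temporary directories stand in for
/home/bloom_git and /home/bloom, and the repository names are invented.

```shell
#!/bin/sh
# Sketch only: a store of bare repositories, with one checked out.
set -eu
demo=$(mktemp -d)
store="$demo/bloom_git"    # stands in for /home/bloom_git
home="$demo/bloom"         # stands in for /home/bloom
mkdir -p "$store" "$home"

# Every repository lives in the store as a bare repository.
git init -q --bare "$store/finances.git"
git init -q --bare "$store/classes-cs101.git"

# Only the repository in active use gets cloned into the home directory.
# (git warns that the clone is empty here, since the demo has no commits.)
git clone -q "$store/finances.git" "$home/finances"

# The old course repository stays in the store and is never checked out.
ls "$home"
```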

> Looking at my current approach I feel like I'm using git for several
> different purposes: simple synchronizing (of dotfiles), synchronizing
> and archiving (of the Documents directory) and actual project
> management. This makes me wonder if I'm trying too hard to use one tool
> here. Maybe there are other tools to be considered?

I use git for all of these things. There are a few places where I feel
subversion might be a better fit (with a few particularly conflict-prone
configuration files, it's easier there to blow away local changes
without creating the extra commits that git needs), but I'm going to
make a go of it with git.

--Ken

-- 
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/
