One Big Repo

Joey Hess joey at
Fri Jul 10 21:25:12 CEST 2009

chombee wrote:
> On Fri, Feb 27, 2009 at 01:55:16PM -0500, Joey Hess wrote:
> > So, some of the specific problems include:
> > 
> > <snip>
> > * Wanting to check large data files into a repo, but not having space
> >   to put that repo on some machines.
> I think a good idea might be to have a special repo for big files 
> only. So you would have two general catch-all repos, one for really 
> big files and one just for small files. Right now I put every file 
> that doesn't belong somewhere else into one catch-all repo, whether 
> the file is big or small. But there's no reason why I shouldn't be 
> able to check out some text files and documents because I committed a 
> big bunch of PNG images.

I set this up myself recently. I have a git repo to which I commit every
photo I pull off my camera, plus scans and videos. I think of it as my
raw data repository.

The bare repo is on my file server; my laptop clones it as follows:

	git clone --shared /media/server/path/raw.git

This way the laptop does not carry a full copy of the object store; the
shared clone reads objects from the file server (over NFS or sshfs) via
.git/objects/info/alternates. But I can still commit locally and push to
the server later.
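To make the sharing mechanism concrete, here is a throwaway sketch (my own illustration, not part of the original setup; it assumes git is installed and uses scratch `server`/`laptop` repos in a temp directory) showing that a --shared clone copies no objects and instead records the source's object store in .git/objects/info/alternates:

```shell
#!/bin/sh -e
# Build a tiny "server" repo, clone it --shared, and inspect the
# alternates file the clone uses to borrow the server's objects.
tmp=$(mktemp -d)
cd "$tmp"

git init -q server
cd server
git config user.email you@example.com
git config user.name "You"
echo hello > file
git add file
git commit -q -m "initial"
cd ..

# A --shared clone copies no objects; it records where to find them.
git clone -q --shared server laptop
cat laptop/.git/objects/info/alternates
```

Note that git-clone(1) warns that pruning or rewriting history in the source repository can corrupt shared clones, so this is a space optimization, not a backup.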

If I commit a lot of big stuff and my local .git repo gets too big, this
dangerous command tries to ensure it's all been pushed to the file
server, and then cleans it out locally:

zap () {
	if [ -e .git/objects/info/alternates ]; then
		git push
		rm -vf .git/objects/??/*
	else
		echo "not a --shared repo!"
	fi
}
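A plain `git push` alone does not prove that every local branch made it across. One stricter pre-check (my own addition, assuming at least one remote is configured) is `git log --branches --not --remotes`, which lists commits on any local branch that no remote ref contains; a cleanup like zap could refuse to delete objects while that output is non-empty. A sketch in a scratch clone:

```shell
#!/bin/sh -e
# Demonstrate the check in a throwaway clone with an "origin" remote.
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare origin.git
git clone -q origin.git work
cd work
git config user.email you@example.com
git config user.name "You"
echo data > file
git add file
git commit -q -m "initial"
git push -q origin HEAD

# The actual safety check: any commits not on any remote?
if [ -n "$(git log --branches --not --remotes --oneline)" ]; then
	echo "unpushed commits remain; not cleaning" >&2
	exit 1
fi
echo "all branches pushed"
```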

The only remaining problem is that checking really enormous files, such
as videos I am working on, into git makes git allocate memory for the whole
file. Needing to set up swap just to git commit a 700 MB DV file on my
netbook is a trifle annoying. :-P

I also use branches a lot in this repo, so that my netbook only keeps the
currently used files checked out. I figure that when this repo gets too big,
I'll just archive it off elsewhere, and start a new one.
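When the time comes to archive it off, one option (my suggestion, not something the original setup does) is `git bundle`, which packs the entire history into a single file that a later `git clone` can restore from. A sketch against a throwaway repo:

```shell
#!/bin/sh -e
# Create a scratch repo, bundle its full history into one file, and
# prove the bundle is restorable by cloning from it.
tmp=$(mktemp -d)
cd "$tmp"
git init -q raw
cd raw
git config user.email you@example.com
git config user.name "You"
echo photo > img.txt
git add img.txt
git commit -q -m "raw data"

# Pack HEAD and every ref, with all history, into one archive file.
git bundle create "$tmp/raw.bundle" HEAD --all

# Restoring later is just a clone from the bundle file.
git clone -q "$tmp/raw.bundle" "$tmp/restored"
```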

> > * Having automated commits to some files (of archived mail, for example),
> >   and not wanting to see that in your general history, or deal with
> >   the merging/up-to-dateness issues it can entail.
> Has anyone got this working (automated commit of archived mail)? 
> Currently I use offlineimap run by cron to sync my mail to a local 
> directory, then another cron job uses rsync to backup this directory, 
> just in case something goes wrong with the live copy. It'd be cool to 
> backup the mail directory by committing to a git repo.

Sure, I use the attached trimail script, which in turn uses archivemail
to move the read mail from the offlineimap maildirs into archival mailboxes,
and is run from cron nightly.
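For the cron half, a one-line crontab entry is all it takes. A sketch (the 3:00 time and the ~/bin/trimail path are placeholders; adjust to wherever the script is installed):

```
# min hour dom mon dow	command
0 3 * * *	$HOME/bin/trimail
```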

see shy jo
-------------- next part --------------
#!/bin/sh -e
# Archive old mail.

cd ~/mail/archive

# Move read mail that is more than a day old out of inbox folders and into
# the archive.
for folder in `find ~/Maildir ~/Maildir/.* -maxdepth 0 -type d -not -name .. -not -name .Drafts`; do
	dest=$(basename $folder | sed 's/^\.//')
	if [ "$dest" = "" ] || [ "$dest" = "Maildir" ]; then
		# The top-level Maildir is the inbox.
		dest=inbox
	fi
	date=$(date +%Y-%m)
	install -d $dest
	if [ "$dest" = spam ] || [ "$dest" = virii ]; then
		# Keep for 7 days, then delete.
		archivemail -d7 --delete $folder
	elif [ "$dest" = postmaster ]; then
		# Keep for 1 day, then delete. While I'm getting flooded
		# anyhow.
		archivemail -d1 --delete $folder
	else
		archivemail -u -d2 -o $dest \
			--archive-name=$date $folder
	fi
done

for dir in `find -maxdepth 1 -mindepth 1 -type d -not -name .git`; do
	# Compress mail not compressed by archivemail.
	find $dir -maxdepth 1 -type f -regex '.*/[0-9]*-[0-9]*$' -exec gzip -9 {} \;

	# Either check old archives in, or delete them after a month.
	if [ -n "$(git log -n 1 -- "$dir")" ]; then
		git add `find $dir -maxdepth 1 -type f -regex '.*/[0-9]*-[0-9]*.gz'` 2>/dev/null || true
	else
		find $dir -maxdepth 1 -type f -mtime +31 -exec rm -f {} \;
	fi
done

if ! git commit -q -a -m "autocommit" 2>/dev/null ; then
	echo "git commit failed" >&2
	exit 1
elif ! git push 2>/dev/null ; then
	echo "git push failed" >&2
	exit 1
fi
More information about the vcs-home mailing list