Building a tool to make the process automated

Ken Bloom kbloom at
Wed Jun 13 16:10:12 CEST 2007

On Tuesday 12 June 2007 08:24:50 pm you wrote:
> > I'd like to know a little more about Bazaar-NG's suitability for
> > this task as compared to Subversion.
> 1. Performance and Repository Size
> Bazaar has done a lot to improve performance over the last few
> versions and they are continuing to do so.
> On the slightly outdated link you can see some performance/size
> measures. It is important to note that BZR has done a lot since then
> to improve performance and repository size so I am not too worried
> about the results it got.
> Then, there is this link:
> with this quote talking about SVN:
> "The FSFS backend places one file per revision in a single directory;
> a test import of Mozilla generated hundreds of thousands of files in
> this directory, causing performance to plummet as more revisions were
> imported. I'm not sure what each file contains, but it seems like
> revisions are written as deltas to an existing revision, making
> damage to one file propagate down through generations. Lack of strong
> error detection means such errors will be undetected by the
> repository. CVS used to suffer badly from this when NFS would
> randomly zero out blocks of files.
> The Mozilla CVS repository was 2.7GB, imported to Subversion it grew
> to 8.2GB. Under Git, it shrunk to 450MB. Given that a Mozilla
> checkout is around 350MB, it's fairly nice to have the whole project
> history (from 1998) in only slightly more space."

Did they run svnadmin deltify on this?

> Obviously the article is about Git, but it does show that SVN has
> huge repository size requirements which is definitely not wanted
> here. It is interesting though that so many of you have used SVN and
> have not gotten any problems from your repository size. It is really
> something that I thought was going to be a major problem.
> A big advantage of Bzr is that it is distributed so you can backup or
> put your repository a bit anywhere (kinda solves the svn external
> issue a bit). Another one is that you can push your changes using
> ssh, http or using many other methods.

That's not what svn:externals is about, I don't think. It's about being 
able to piece together a home directory from a repository that includes 
all of the information necessary to build several different home 
directory configurations -- different subsets for different machines. I 
do the same thing, but I just check out the parts I need on an ad-hoc
basis.

> Another advantage is that you don't get all these .svn directories
> everywhere. With bzr you just get a .bzr directory as the root
> directory of your repository. Yet another advantage is that you can
> have lightweight checkouts, which lets all the history actually be
> stored on a remote server or an external drive.

I was really curious not about sizes, but about logistics. Is it easy to
back up (and perform incremental backups of) a bzr repository? Do I
make $HOME on some machine be the main repository, or do I make a
separate repository that isn't a working directory, and then have
repository-like features in each of my home directories? If the former,
then is it easy to back up the repository but still exclude all manner
of temporary files?

I like the distributed bit. Is it possible to check out parts of 
repositories, and piece different parts of a repository together into a 
single home directory the way people do with svn:externals?

> > > Under SVN, how do you make sure that the repository does not grow
> > > too large over the years? I would assume that after a few years,
> > > you might want to just keep a monthly granularity of your backups
> > > so as to reduce the size of the repository?
> >
> > Hasn't been a problem for me yet, in fact with my backup script
> > (also on the wiki somewhere) I've found that my mail spool grows
> > more quickly than my repository.
> Is there any way I could get some numbers. How big is the $HOME
> directory without the version control files? How big is the
> repository?

My home directory includes directories from the 3 repositories:

karvelian:/var/svn is 45768 KB
cat-in-the-hat:/home/bloom_svn/ is 241284 KB
cat-in-the-hat:/home/procurent/ is 21856 KB

My home directory excluding ~/scratch, excluding **/.svn, excluding 
~/Maildir, but including compiled binaries, PDFs generated from TeX 
documents, and the like (things that would be hard to exclude from a 
backup using any method other than version control) is 279632 KB.
This is an estimate, since I don't have an alternate policy now for 
determining what would be included in a backup and what wouldn't.
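
For anyone who wants comparable numbers, an estimate like this could be
reproduced with a short script. This is only a sketch: the function name
tree_size_kb and its parameters are made up for illustration, and it sums
apparent file sizes rather than disk blocks, so it will not match du
exactly.

```python
import os


def tree_size_kb(root, exclude_names=(".svn",), exclude_paths=()):
    """Sum file sizes under root, in KB, skipping excluded directories.

    Hypothetical helper for illustration only:
    exclude_names -- directory names skipped at any depth (e.g. ".svn")
    exclude_paths -- directories, relative to root, skipped entirely
    """
    root = os.path.abspath(root)
    skip = {os.path.join(root, p) for p in exclude_paths}
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded dirs.
        dirnames[:] = [
            d for d in dirnames
            if d not in exclude_names and os.path.join(dirpath, d) not in skip
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlink targets
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total // 1024
```

Something like tree_size_kb(os.path.expanduser("~"),
exclude_paths=("scratch", "Maildir")) would then approximate the figure
above.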

I have not run svnadmin deltify on any of these repositories.

> > I think a big advantage is to keep the file around in the repository
> > so that if I need to find it later, I can. What I do need is a better
> > way to search through the repository and restore something when I
> > need it. The svn command never seems to do it quite right, and when
> > it does, it's not easy. Any ideas?
> Apple unveiled the Time Machine and I like how you can just search
> for a file that has been deleted and it will give it to you using the
> Finder/Time Machine interface. I guess just adding a search
> command/box that returns you all the files (visually or using the
> command line) that match which have been in the repository would fix
> that.

I'm starting to think about whether one of the web-based subversion 
repository viewers may be what I want, at least for now.
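
In the meantime, a command-line search for deleted files might look
something like the sketch below. It assumes the output of
"svn log -v --xml" has been saved or captured, and it looks for path
entries with action "D". The function name deleted_paths and the glob
pattern argument are hypothetical, not part of any existing tool.

```python
import fnmatch
import xml.etree.ElementTree as ET


def deleted_paths(log_xml, pattern="*"):
    """Return (revision, path) pairs for deletions matching a glob pattern.

    log_xml is the text (or bytes) produced by `svn log -v --xml`.
    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(log_xml)
    hits = []
    for entry in root.iter("logentry"):
        rev = int(entry.get("revision"))
        for path in entry.iter("path"):
            # action="D" marks a path deleted in this revision.
            if path.get("action") == "D" and fnmatch.fnmatchcase(
                path.text, pattern
            ):
                hits.append((rev, path.text))
    return hits
```

Each hit names the revision in which the file was deleted, so the file
itself should still be present in the revision just before it.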


Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.

More information about the vcs-home mailing list