[announce] Sharebox, a FUSE filesystem relying on git-annex

Dieter Plaetinck dieter at plaetinck.be
Thu Mar 31 20:04:15 CEST 2011


On Thu, 31 Mar 2011 18:56:54 +0200
Christophe-Marie Duquesne <chm.duquesne at gmail.com> wrote:

> Hi,
> 
> I am currently writing a FUSE file system based on git-annex for
> replicating binary files on several machines. I thought I could share
> it here in order to get some ideas and contributors.
> 
> What are your goals?
> Seamless synchronization "à la dropbox".
> Ability to use with big binary files such as mp3/movies.
> Entirely decentralized.
> Don't use unnecessary space
> Keep it simple: avoid special VCS commands and keep a filesystem
> interface as much as possible.

But don't you still need to run various git/git-annex commands, or am I missing something?
 
> Why?
> Because sparkleshare and dvcs-autosync are bad at versioning binary files

I quite like dvcs-autosync, but it indeed lacks space-efficient storage of big files.
I would like to see whether we can use git-annex to support this in dvcs-autosync, although AFAIK git-annex is not transparent in the way regular git is (i.e. it needs to explicitly copy files between locations). I assume this is the reason you went for a FUSE-based approach? Or do you just prefer this over a regular fs + inotify?
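To make the "explicit copy" point concrete, here is a minimal sketch of the calls a sync tool would have to issue on the user's behalf. It only assumes the `git annex get` and `git annex copy --to` subcommands; the helper names, the file path and the remote name "laptop" are made up for illustration and are not taken from sharebox or dvcs-autosync.

```python
import subprocess


def annex_get_cmd(path):
    """Command that fetches the content of an annexed file into this repo."""
    return ["git", "annex", "get", path]


def annex_copy_cmd(path, remote):
    """Command that pushes the content of an annexed file to a named remote."""
    return ["git", "annex", "copy", "--to", remote, path]


def run_in_repo(cmd, repo):
    """Run a command inside the given repository, raising if it fails."""
    return subprocess.run(cmd, cwd=repo, check=True)


# Hypothetical example: the path and the remote name are placeholders.
print(" ".join(annex_copy_cmd("music/song.mp3", "laptop")))
```

A plain `git push` moves only the symlinks and the git-annex bookkeeping, so something (a FUSE layer, an inotify daemon, or the user) has to run commands like these for the file contents to follow.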

> Because Unison needs disk space for each pair of hosts it
> synchronizes and thus does not really scale beyond 2 hosts
> Because Coda is not completely decentralized and it bothers me

Have you actually tried Coda? It's something I'm interested in. On paper it looks like an awesome, maybe-even-perfect open source Dropbox clone, but the reality is probably different. I have never tried it, so I wouldn't know.
 
> What do you have?
> A python implementation. It is about 600 sloc, and you'll find it on
> https://github.com/chmduquesne/sharebox
> Be careful, it is very alpha and it still does not have a proper
> conflict handler.
> 
> Hey, but copying is slow!
> On my machine, copying files to a sharebox fs is about 10 times slower
> than copying them to a normal fs. All the time is spent in python's
> os.write(): I guess the only way to work around this problem is to
> rewrite the whole thing in C, but I am keeping this for later.

Hmm, writing files is I/O-bound, so I doubt the language will have much effect here.
Check with top/vmstat whether you get iowait. If so, your storage medium is getting saturated and rewriting in C won't help. Maybe it's a network/buffering/.. issue.
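If you would rather measure this from Python than watch top/vmstat, a rough sketch follows. It assumes Linux, where the first line of /proc/stat holds cumulative CPU tick counters in the order documented in proc(5); the function names are my own. Take one sample before the slow copy and one after, then compare.

```python
# Field order of the aggregate "cpu" line in Linux /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Older kernels report fewer fields; zip() simply stops early.
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def iowait_fraction(before, after):
    """Share of CPU time spent waiting on I/O between two samples."""
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in FIELDS}
    total = sum(deltas.values())
    return deltas["iowait"] / total if total > 0 else 0.0


def read_cpu_sample(path="/proc/stat"):
    """Read and parse the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Call `read_cpu_sample()` once, run the copy onto the sharebox mount, call it again and pass both samples to `iowait_fraction()`. A value close to zero while the copy crawls would suggest the time is going somewhere other than the disk.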

> I am interested in:
> - suggestions for the functional design (I have my ideas, but I'd love
> to be challenged).

In your README you suggest using a crontab for synchronisation. Maybe you can reuse, or be inspired by, the XMPP system dvcs-autosync uses: it works quite well, it's quite robust, and it's instant :)


Dieter


More information about the vcs-home mailing list