[announce] Sharebox, a FUSE filesystem relying on git-annex
Dieter Plaetinck
dieter at plaetinck.be
Thu Mar 31 20:04:15 CEST 2011
On Thu, 31 Mar 2011 18:56:54 +0200
Christophe-Marie Duquesne <chm.duquesne at gmail.com> wrote:
> Hi,
>
> I am currently writing a FUSE file system based on git-annex for
> replicating binary files on several machines. I thought I could share
> it here in order to get some ideas and contributors.
>
> What are your goals?
> Seamless synchronization "à la dropbox".
> Ability to use with big binary files such as mp3/movies.
> Entirely decentralized.
> Don't use unnecessary space
> Keep it simple: avoid special VCS commands and keep a filesystem
> interface as much as possible.
Don't you still need to run various git/git-annex commands, or am I missing something?
> Why?
> Because sparkleshare and dvcs-autosync are bad at versioning binary files
I quite like dvcs-autosync, but it does indeed lack space-efficient storage of big files.
I would like to see whether git-annex could be used to support this in dvcs-autosync. AFAIK, though, git-annex is not transparent in the way regular git is (i.e. files have to be copied between locations explicitly). I assume this is the reason you went for a FUSE-based approach? Or do you just prefer it over regular fs + inotify?
> Because Unison needs disk space for each couple of hosts it
> synchronizes and thus does not really scale for more than 2 hosts
> Because Coda is not completely decentralized and it bothers me
Have you actually tried Coda? It's something I'm interested in. On paper it looks like an awesome, maybe even perfect, open source Dropbox clone, but the reality is probably different. I never tried it, so I wouldn't know.
> What do you have?
> A python implementation. It is about 600 sloc, and you'll find it on
> https://github.com/chmduquesne/sharebox
> Be careful, it is very alpha and it still does not have a proper
> conflict handler.
>
> Hey, but copying is slow!
> On my machine, copying files to a sharebox fs is about 10 times slower
> than copying it on a normal fs. All the time is spent in python's
> os.write(): I guess the only way to work around this problem is to
> rewrite the whole thing in C, but I am keeping this for later.
Hmm, writing files is I/O-bound, so I doubt the language will have much effect here.
Check with top/vmstat whether you get iowait. If so, your storage medium is getting saturated and rewriting in C won't help. Maybe it's a network/buffering/.. issue.
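
As a complement to top/vmstat, here is a rough sketch of a probe that compares wall-clock time with the CPU time of the writing process. The function name time_write, the file name probe.bin and the sizes are made up for illustration; to be useful, the path would have to point inside the sharebox mountpoint rather than a temporary directory. If wall time is much larger than CPU time, the writer is mostly waiting (on the disk, or on the FUSE daemon, whose CPU time is not counted here), not burning cycles in the interpreter.

```python
import os
import tempfile
import time


def time_write(path, total_bytes=8 * 1024 * 1024, chunk_size=4096):
    """Write total_bytes to path in chunk_size pieces.

    Returns (wall_seconds, cpu_seconds) for the writing process only.
    """
    chunk = b"\0" * chunk_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    try:
        written = 0
        while written < total_bytes:
            # os.write may write fewer bytes than asked; count what it reports
            written += os.write(fd, chunk[: total_bytes - written])
        os.fsync(fd)  # include the time needed to flush to the storage medium
    finally:
        os.close(fd)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    # Hypothetical target: replace the temporary directory with the mountpoint.
    with tempfile.TemporaryDirectory() as target_dir:
        wall, cpu = time_write(os.path.join(target_dir, "probe.bin"))
        print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

Running it once on a normal directory and once on the mounted fs, with the same sizes, should show whether the slowdown is waiting time or CPU time in the writer.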
> I am interested in:
> - suggestions for the functional design (I have my ideas, but I'd love
> to be challenged).
In your README you suggest using a crontab for synchronisation. Maybe you could reuse, or be inspired by, the XMPP system dvcs-autosync uses: it works quite well, it's quite robust and it's instant :)
Dieter