[announce] Sharebox, a FUSE filesystem relying on git-annex
joey at kitenet.net
Sun Apr 3 05:19:52 CEST 2011
Dieter Plaetinck wrote:
> @Joey: you mentioned you think inotify might be a better
> backend/paradigm for this than fuse, so do you think implementing
> git-annex in something like dvcs-autosync is feasible? and/or
Feasible? Certainly. Preferable? I'm in the "let a thousand flowers
bloom" phase. It's spring. :)
As Christophe-Marie has pointed out, git-annex makes annexed files
semi-immutable, and FUSE can hide that quirk, while inotify watching cannot.
That could be confusing for certain users or use cases, if they are not
aware of what is going on. Or it could be something quickly learned
about how these special replicated directories work, that files have to
be copied to be changed.
This is also an area I hope to improve in git-annex, by using git smudge
filters. So it might get a mode where files can be modified and git
commit just annexes the new content. Last time I looked at this, git was
not *quite* there to let it be done efficiently.
> I quite like dvcs-autosync (partially because inotify is more simple
> than fuse, partially because it currently works already quite well) and I'm
> interested in making it support space efficient storage of big files;
> from what I've read it should be possible to do this with git-annex
> (which should not even change how we currently deal with small files,
> they would still be in git) but I'm still doing my first baby steps
> with git-annex so I wouldn't know. Advice very welcome..
All it probably needs at its simplest is something like this
(excuse the haskell):
    toobig <- checkFileSize file
    if toobig
        then git_annex_add file
        else git_add file
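A slightly fuller sketch of the same idea, in case it helps. The names sizeThreshold, isTooBig, addArgs and trackFile are made up for illustration, and the 10 MB cutoff is an arbitrary assumption. It only relies on `git add` and `git annex add`, run inside a repository that has already been set up with `git annex init`.

```haskell
import System.Directory (getFileSize)
import System.Process (callProcess)

-- Hypothetical cutoff: files of at least this many bytes go to the annex.
sizeThreshold :: Integer
sizeThreshold = 10 * 1024 * 1024

-- Pure decision, kept separate so it can be checked without a repository.
isTooBig :: Integer -> Bool
isTooBig size = size >= sizeThreshold

-- Arguments to pass to `git` for a file of the given size.
addArgs :: Integer -> FilePath -> [String]
addArgs size file
  | isTooBig size = ["annex", "add", file]
  | otherwise     = ["add", file]

-- Track one file: small files go into git, big ones into the annex.
trackFile :: FilePath -> IO ()
trackFile file = do
  size <- getFileSize file
  callProcess "git" (addArgs size file)
```

Something like dvcs-autosync would call the equivalent of trackFile from its inotify handler whenever a file appears or changes.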
> Another note : files being tracked with git-annex through sharebox or
> dvcs-autosync or whatever should always have at least 1 "backup copy",
> so that if the file gets deleted everywhere, it still can be retrieved
> from somewhere (which raises the interesting question: where will you
> store this backup copy? introducing a node/repository which will hold
> backup copies can be considered going to a centralized model; which is
> something you (Christophe-Marie) try to explicitly avoid, but I think
> this is not necessarily a problem)
This is something git-annex goes to great lengths to deal with.
It will enforce N backup copies; it tracks which other repositories
have which files; it can transfer wanted file contents from other
repositories in either a decentralized or a centralized manner; the
other repositories can be on other drives of the same computer, or
accessible by ssh, or even, now, Amazon S3.
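As a rough illustration of how a wrapper could drive those features, here is a sketch that maps each one to a git-annex command line. The AnnexAction type and the function names are invented for this example. The `numcopies` subcommand is an assumption about the git-annex release in use; older releases set the number of copies through configuration instead.

```haskell
import System.Process (callProcess)

-- Hypothetical wrapper type: one constructor per feature mentioned above.
data AnnexAction
  = RequireCopies Int        -- enforce N copies of each file
  | WhereIs FilePath         -- list repositories that hold the content
  | Get FilePath             -- fetch content from whichever remote has it
  | CopyTo String FilePath   -- push content to a named remote

-- Arguments to pass to `git` for each action.
annexArgs :: AnnexAction -> [String]
annexArgs (RequireCopies n)    = ["annex", "numcopies", show n]
annexArgs (WhereIs file)       = ["annex", "whereis", file]
annexArgs (Get file)           = ["annex", "get", file]
annexArgs (CopyTo remote file) = ["annex", "copy", "--to", remote, file]

-- Run one action inside the current repository.
runAnnex :: AnnexAction -> IO ()
runAnnex = callProcess "git" . annexArgs
```

Whether the remote named in CopyTo is a dedicated backup node or just another peer is up to the user, so the same calls fit both the centralized and the decentralized setup.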
see shy jo