syncing non-git trees with git-annex

Adam Spiers vcs-home at adamspiers.org
Wed Dec 14 14:15:23 CET 2011


On Wed, Dec 14, 2011 at 12:53 PM, Richard Hartmann
<richih.mailinglist at gmail.com> wrote:
> I would use
>
>  find -name \*.avi -exec git annex add {} \;

That's substantially slower, because it forks a whole new tree of
git / git-annex processes for every single file.

If we're getting picky, we should also worry about spaces in
filenames:

  find . -name '*.avi' -print0 | xargs -0 git annex add

but it was only an example, and to be honest, I didn't even use the
xargs variant myself; I used zsh's recursive globbing:

  git annex add **/*.avi

but I thought that might confuse non-zsh users so I replaced it with
xargs in the example :-)
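
To make the process-count difference concrete, here is a minimal sketch
that counts how many times a command gets started by each variant. The
"sh -c" one-liner is only a stand-in for "git annex add" (an assumption
for illustration, so the sketch runs without a repository), and the
sample file names are made up. The third variant, "-exec ... {} +", is
not from this thread; it is the standard find form that batches
arguments much like xargs does.

```shell
#!/bin/sh
# Sketch: count how often a command is started by each approach.
# The "sh -c" one-liner stands in for "git annex add" (hypothetical
# stand-in); it logs one line per invocation with its argument count.
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Three sample files, two of them with spaces in their paths.
mkdir -p "$tmp/tree/sub dir"
touch "$tmp/tree/a.avi" "$tmp/tree/b c.avi" "$tmp/tree/sub dir/d.avi"

# Single-quoted so that $# and $COUNT_LOG are expanded by the inner sh.
log_call='echo "$#" >> "$COUNT_LOG"'

# 1. One process per file: -exec ... \;
COUNT_LOG="$tmp/per_file.log"; export COUNT_LOG
find "$tmp/tree" -name '*.avi' -exec sh -c "$log_call" sh {} \;
per_file_calls=$(wc -l < "$COUNT_LOG" | tr -d ' ')

# 2. Batched and NUL-delimited, so spaces in names are safe: xargs -0
COUNT_LOG="$tmp/xargs.log"
find "$tmp/tree" -name '*.avi' -print0 | xargs -0 sh -c "$log_call" sh
xargs_calls=$(wc -l < "$COUNT_LOG" | tr -d ' ')

# 3. Batched by find itself: -exec ... {} +
COUNT_LOG="$tmp/plus.log"
find "$tmp/tree" -name '*.avi' -exec sh -c "$log_call" sh {} +
plus_calls=$(wc -l < "$COUNT_LOG" | tr -d ' ')

echo "per-file invocations: $per_file_calls"
echo "xargs -0 invocations: $xargs_calls"
echo "-exec + invocations:  $plus_calls"
```

With the three sample files, the first form starts one process per
file, while the two batched forms each start a single process that
receives all the file names as arguments.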

> but other than that, this seems fine.

Great, thanks!

> Depending on your level of OCD,
> it might make sense to throw away the initial repo once your data is
> clean and import everything in a clean one. That's what I do.

Interesting - any particular reason for doing that?

> PS: There are various tools for finding duplicates, but git-annex
> gives you this functionality for free, so..

Yes, this one is particularly good:

  http://sourceforge.net/projects/fastdup/

but git-annex lets you do it in a much more distributed fashion.

