How to start syncing two existing directories with git annex?

Sean Hammond snhmnd at gmail.com
Wed Nov 27 20:48:38 CET 2013


Thanks for your help Joey,

>> 1. The total number of files in ~/Annex, not including .git, on A
>> and B is different:
>>
>> ls -R1 ~/Annex | wc -l
>> 21830
>>
>> ls -R1 ~/Annex | wc -l
>> 21845
>>
>> 2. git-annex status shows untracked and modified files on both
>> machines (different files on each machine).
>
> These seem likely to be related. Can you show the status?

Currently on machine A (which has 21845 files) git-annex status outputs
nothing. On machine B, which has 15 fewer files, it lists 13 untracked, 31
deleted and 1 modified file. The output of this command seems to have
changed since yesterday on both machines, even though I haven't changed 
the files and I thought git-annex finished syncing ages ago.
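A note on the counts quoted above: ls -R1 also prints a header line and a blank line for every directory, so the two totals include more than files. A minimal sketch of a stricter count and a diffable listing follows, assuming a POSIX shell and find. The function names count_files and list_files are made up for illustration, and paths containing newlines would be miscounted.

```shell
# Hypothetical helpers, not part of git-annex.

# Count regular files and symlinks under a directory, skipping anything named .git.
count_files() {
    find "$1" -name .git -prune -o \( -type f -o -type l \) -print | wc -l | tr -d ' '
}

# Print a sorted list of relative paths, suitable for diffing between machines.
list_files() {
    (cd "$1" && find . -name .git -prune -o \( -type f -o -type l \) -print | LC_ALL=C sort)
}
```

Running list_files ~/Annex > files-A.txt on one machine and the same into files-B.txt on the other, then diff files-A.txt files-B.txt, should show which paths exist on only one side.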

> Are you using direct mode, or indirect mode?

Direct mode, I think. Both annexes were created using the assistant, and
most of the entries in both are regular files, not symlinks.

>> 3. On each machine, 7 files have been replaced with broken symlinks
>> to files in .git/objects. This time it is the same files on both
>> machines, so it looks as if git-annex might have lost these files
>> from both machines. git-annex fsck finds these 7 and reports them as
>> 'No known copies exist'.
>
> You can run git annex log on some of these files to see the history of which
> repository they were in and how they moved around.

For these files git-annex log outputs nothing, on either machine.
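For anyone wanting to check the same thing on their own machines, the broken symlinks can be listed with plain find. This is only a sketch under the assumption of a POSIX find. The name list_broken_symlinks is hypothetical, and the function uses no git-annex commands.

```shell
# Hypothetical helper: list symlinks whose target does not exist,
# skipping anything named .git.
list_broken_symlinks() {
    # test -e follows the link, so it fails when the target is missing.
    find "$1" -name .git -prune -o -type l ! -exec test -e {} \; -print
}
```

Calling list_broken_symlinks ~/Annex on both machines and comparing the output would confirm whether the same files are affected on each.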

>> 4. Even after running git gc --aggressive --prune and git-annex
>> dropunused, the .git directories are massive: 23G and 2.5G, for just
>> ~20,000 files.
>
> Are you looking at the sizes of the .git/objects directories, or the
> .git/annex/objects directories? (.git/annex/tmp is also a possible place where
> cruft could somehow accumulate)

Almost all of the disk usage is in .git/annex/objects/ on both machines.
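A rough way to get that breakdown is to run du over the three locations mentioned in the question. The sketch below assumes a POSIX shell. The name git_dir_usage is made up, and sizes are reported in kilobytes.

```shell
# Hypothetical helper: report disk usage (in KB) of the .git subdirectories
# named in the question, skipping any that do not exist.
git_dir_usage() {
    for d in objects annex/objects annex/tmp; do
        if [ -d "$1/.git/$d" ]; then
            du -sk "$1/.git/$d"
        fi
    done
}
```

For example, git_dir_usage ~/Annex prints one line per existing directory, which makes it easy to see where the space is going.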

> When you ran git annex dropunused, did it drop something? git annex unused
> should not find any unused files if you've just synced 2 directories, and never
> deleted any of the files yet.

It did find and drop some unused files, yes.

I think at this point I should probably recover from backup and go back 
to using unison to synchronize my large files directories. It'll never 
live up to what git annex promises, but it's a lot easier to understand 
what's happening. Since almost all the files still seem to be intact, it 
shouldn't take long to rsync back just the files that got changed or lost.

I'll hang on for a bit, just in case we can get to the bottom of what's 
happened here with git-annex.


More information about the vcs-home mailing list