checking which files on a CD are not in a git-annex repo

Thomas Koch thomas at koch.ro
Wed Mar 28 08:05:22 CEST 2012


Joey Hess:
> Thomas Koch wrote:
> > It'd be of course wonderful if I could tell git-annex directly to import
> > all files of the disc. Duplicate files should symlink to the same file
> > in the git-annex backend, shouldn't they?
> 
> Yes. If you don't mind the overhead of copying all the files, simply
> copying the whole CD to a subdirectory and running git annex add will do
> the trick. Any duplicate files will coalesce when added.

Certainly not perfect, but good enough. It matches by file name only and
copies anything it doesn't find into the current directory:

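# Run from the top of the annex repository; $1 is the CD mount point.
CDDIR=$1

find "$CDDIR" -type f -print | while IFS= read -r F
do
#  echo "searching $F"
  FILENAME=$(basename "$F")
  # Match by name only, not content; skip the repository's own .git directory.
  # (find prints paths as ./.git, so the pattern needs the leading ./)
  FOUND=$(find . -path ./.git -prune -o -name "$FILENAME" -print | head -n 1)
  if [ -r "$FOUND" ]
  then
    echo "found $FOUND"
  else
    echo "not found: $F"
    # Recreate the file's source path below the current directory and copy it.
    DIRNAME=$(dirname "$F")
    mkdir -p ./"$DIRNAME"
    cp -v "$F" ./"$DIRNAME"
  fi
done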
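
Since the script above compares names only, a renamed duplicate gets copied
again, and a different file that happens to share a name is skipped. A
content-based check might look roughly like the following. This is only a
sketch: it assumes GNU sha256sum is installed, the function name
missing_by_content is made up for illustration, and it just prints the files
it considers missing instead of copying them. File names containing newlines
or backslashes are not handled.

```shell
#!/bin/sh
# Sketch: print files under a CD directory whose content is not yet present
# in a repository working tree. Assumes GNU coreutils sha256sum.
# missing_by_content is a hypothetical helper name, not part of git-annex.
# Intended use: missing_by_content /path/to/cd /path/to/repo

missing_by_content() {
  cddir=$1
  repodir=$2
  sums=$(mktemp)

  # Hash every file in the working tree. -L follows symlinks, so annexed
  # files whose content is present are hashed via their link targets;
  # dangling links are not of type f and are skipped. .git is pruned.
  find -L "$repodir" -name .git -prune -o -type f -exec sha256sum {} + \
    | cut -d' ' -f1 | sort -u > "$sums"

  # Report each CD file whose hash does not appear in the list.
  find "$cddir" -type f -print | while IFS= read -r f
  do
    sum=$(sha256sum "$f" | cut -d' ' -f1)
    grep -qxF "$sum" "$sums" || printf '%s\n' "$f"
  done

  rm -f "$sums"
}
```

Hashing the whole working tree on every run is slow for a large repository,
so this trades speed for not being fooled by file names.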

Still, a solution integrated in git-annex would be wonderful!

Thomas Koch, http://www.koch.ro

