
Re: Selecting files to back up



On 6/29/20 1:49 PM, Graham Percival wrote:
Sorry for the delay.  Colin got married on Friday, so we were both
a bit distracted.  :)

Nice, congratulate Colin for me (not that he has any idea who I am :-) ), and I totally understand being distracted!

1) generate it via an intermediate archive.  Create one without
a/b, then a new one containing the previous archive plus "a/b/f2".

        $ tarsnap -c --exclude "a/b" -f backup-tmp a
        $ tarsnap -c -f backup-real @@backup-tmp a/b/f2
        $ tarsnap -d -f backup-tmp
        $ tarsnap -tv -f backup-real
        drwx---rwx  0 td     td          0 Jun 29 12:54 a/
        -rw-r--r--  0 td     td          0 Jun 29 12:54 a/f0
        -rw-r--r--  0 td     td          0 Jun 29 12:54 a/b/f2

    (I used `chmod 707 a` to demonstrate some unusual metadata)

((I was inspired by https://unix.stackexchange.com/a/243875))

I was halfway to this. I'd considered using two archives: one with the "exceptions" (files I want to back up that live deep within trees I otherwise don't want to back up), and one with the main stuff, completely excluding some trees. What I didn't like was ending up with two sets of archives, and despite having read through tarsnap(1) a number of times :-), I totally didn't think of including the exceptions archives in the main archives. Given that part, this seems like the way to go and is almost certainly what I'll do.
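
For anyone wanting to script approach 1, here is a minimal sketch that wraps the three commands above in one shell function. The function name `backup_with_exceptions`, the `TARSNAP` variable, and the `-tmp` suffix are made up for illustration; only the tarsnap options already shown in this thread are used. It assumes a single excluded directory and has not been tested against a real tarsnap account.

```shell
#!/bin/sh
# Sketch only: TARSNAP, backup_with_exceptions and the "-tmp" suffix are
# illustrative names, not part of tarsnap.
TARSNAP=${TARSNAP:-tarsnap}

# usage: backup_with_exceptions ARCHIVE ROOT EXCLUDED_DIR EXCEPTION...
backup_with_exceptions() {
    archive=$1 root=$2 excluded=$3
    shift 3                      # remaining arguments are the exception files
    tmp="${archive}-tmp"

    # 1. Archive the tree without the excluded directory.
    $TARSNAP -c --exclude "$excluded" -f "$tmp" "$root" || return 1
    # 2. Create the real archive from the temporary one plus the exceptions.
    $TARSNAP -c -f "$archive" "@@$tmp" "$@" || return 1
    # 3. Delete the temporary archive.
    $TARSNAP -d -f "$tmp"
}
```

Calling `backup_with_exceptions backup-real a a/b a/b/f2` should issue the same three commands as in Graham's example, with `backup-real-tmp` as the intermediate archive name. Adding or dropping an exception then means changing only the trailing arguments. If step 2 fails, the temporary archive is left in place and has to be deleted by hand.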

2) Use the -s option: if a filename is replaced with the empty
string, it won't be backed up.  So if we can match "a/b/* but not
a/b/f2", then it should work.

Unfortunately I'm not very familiar with non-trivial regex, so I
don't have a working example.  So in pseudocode, it would be:

     $ tarsnap -c --dry-run -v -s ',a/b/(.* but not f2),,' a

((I was inspired by
https://github.com/libarchive/libarchive/issues/1071#issuecomment-427807094))

I'll keep on playing with solution #2, so it's possible that I
might be able to send an update.  FWIW, the regex is interpreted
with regcomp() using REG_BASIC.

With perl regular expressions and their lookaheads and lookbehinds, I'm pretty sure I can see how this would work, but I strongly suspect regcomp doesn't support those, no matter what cflags you hand it. And even if it did work, it seems like it would be hard to maintain and modify as I identify new "exceptions" (or drop old ones). So it's an interesting idea, but given that 1 above is straightforward and should be pretty easy to maintain and modify, that seems the way to go.

Thanks for the help, Graham!

Brian


Cheers,
- Graham Percival

On Thu, Jun 25, 2020 at 08:12:33PM -0700, Brian L. Matthews wrote:
I'm just starting with tarsnap, and have it built, installed, and running on
my iMac running Mojave, and I've been doing some dry runs to narrow down
what I back up. Mostly things are working as I expect, but I've got one case
that I haven't figured out a way to do. That's where I want to exclude a
directory and its contents *except* for a smattering of files under it.
Here's a simplified example.

Say I have:

$  find a -print
a
a/f0
a/b
a/b/f2
a/b/f1
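
To follow along, the example tree can be recreated with something like the sketch below. The scratch directory from `mktemp -d` is an assumption added here so that nothing in the current directory is touched; the order in which `find` lists the entries may differ from the listing above.

```shell
# Recreate the example tree in a throwaway directory (location is arbitrary).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p a/b                 # creates both a/ and a/b/
touch a/f0 a/b/f1 a/b/f2     # empty files are enough for dry runs
find a -print
```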

Then:

$ tarsnap --dry-run --no-default-config -c -v a
a a
a a/f0
a a/b
a a/b/f2
a a/b/f1

works as expected. Now, what I really want is to not archive anything in a/b
except a/b/f2. I can do that with -T:

$ cat T
a/f0
a/b/f2
$ tarsnap --dry-run --no-default-config -c -v -T T
a a/f0
a a/b/f2

but that means I lose all the directory metadata (if I include the
directories, it backs up the directories *and* the files I include
explicitly.) I could use nodump, but I'd have to set nodump on hundreds of
thousands of files, and I don't know what that would interfere with (for
example, one of my local backup methods is Time Machine and I want it to
back up some of the files I don't want tarsnap to back up). I've tried
include and exclude in .tarsnaprc, but nothing I've tried has worked the way
I want. So... any thoughts?

Thanks,
Brian