
Re: Problems restoring hard-linked files



Colin Percival <cperciva@tarsnap.com> wrote:

> Yes, known issue as of about a year ago; as far as I know you're only the
> second person to trip over this.
>
> It's an awkward problem relating to the way the tar format works: Because tar
> is a streaming format, when we see data for the first time there's no way to
> know if that is hardlinked to a file which we will want to extract later --
> and when we come to the hardlink we want to extract later, trying to "rewind"
> the tape is problematic.  (Normal tar utilities run into the same problems,
> incidentally.)
>
> Right now I'm looking at two ways of attacking this:
> 1. Include data in every archive entry, including hard links -- this would
> make archives larger, but tarsnap's deduplication should make that mostly
> irrelevant.
> 2. Make a note of hardlinks where we didn't extract the first copy of the
> data, and then add a second pass through the archive to recover those -- this
> would keep archives the same size, but is considerably more complicated and
> potentially bug-prone due to edge cases like extracting files into directories
> which are being created with read-only permissions.
>
> If anyone has comments on these options or suggestions for other approaches,
> please comment on the github issue I've opened for tracking this:
>   https://github.com/Tarsnap/tarsnap/issues/18

Thanks for the explanation, which makes sense! I'll head over to GitHub
to comment; I didn't even realise tarsnap was on there.
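
For anyone else following the thread, here is a minimal sketch of the
behaviour described above. It uses Python's standard tarfile module rather
than tarsnap itself, and the file names and the function name are made up for
illustration. It only shows that the second entry for a hard-linked file
is stored as a link record with no data of its own, so the contents exist
only in the first entry.

```python
import io
import os
import tarfile
import tempfile


def list_hardlink_archive():
    """Archive a file plus a hard link to it, then describe each tar entry.

    Returns a list of (name, is_hard_link, size, linkname) tuples.
    File names here are hypothetical, chosen only for the demonstration.
    """
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        link = os.path.join(tmp, "link.txt")
        with open(original, "wb") as f:
            f.write(b"hello hard links\n")
        os.link(original, link)  # second name for the same inode

        # Write both names into an in-memory tar archive, in stream order.
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w") as tar:
            tar.add(original, arcname="original.txt")
            tar.add(link, arcname="link.txt")

        # Read the archive back and inspect the headers.
        buf.seek(0)
        with tarfile.open(fileobj=buf, mode="r") as tar:
            return [(m.name, m.islnk(), m.size, m.linkname)
                    for m in tar.getmembers()]


if __name__ == "__main__":
    for entry in list_hardlink_archive():
        print(entry)
```

The second entry only names "original.txt" as its target, so a tool that
skipped the first entry while streaming has nothing to restore the link
from unless it goes back through the archive.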

Cheers, Jamie