Re: Unique files
On 10/25/14 12:21, Scott Robison wrote:
> Is there a means to determine what is "new" between two "full" archives?
Not directly.
> I do a full archive of an entire partition each day, and am a little
> surprised by how much new data exists. I would like to "diff" the two
> archives, and am hoping there is something built in that is relatively
> more efficient than a brute force approach. I can brute force it if need
> be, but would rather not. Thanks!
Because Tarsnap's deduplication happens after files have been squished
together into a tar stream and that tar stream has been split into
blocks, it's not feasible to track backwards to figure out which file
a particular new block came from. (For that matter, you can get blocks
which contain pieces from several different files.)
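A toy illustration of that point, as a rough sketch only: it uses fixed 8-byte blocks and plain concatenation with no tar headers, whereas Tarsnap's real chunking is more sophisticated. The file names, contents, and the helper names (to_stream, split_blocks, new_blocks) are all made up for the example.

```python
import hashlib

BLOCK_SIZE = 8  # assumption: tiny fixed-size blocks, purely for illustration


def to_stream(files):
    """Concatenate file contents in name order (a stand-in for a tar stream)."""
    return b"".join(files[name] for name in sorted(files))


def split_blocks(stream, size=BLOCK_SIZE):
    """Cut the stream into blocks without regard to file boundaries."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def new_blocks(stream, seen, size=BLOCK_SIZE):
    """Return the blocks whose hash has not been seen before, recording them."""
    fresh = []
    for block in split_blocks(stream, size):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in seen:
            seen.add(digest)
            fresh.append(block)
    return fresh


# Hypothetical data: two 12-byte files; one byte of b.txt changes on day 2.
files_day1 = {"a.txt": b"A" * 12, "b.txt": b"B" * 12}
files_day2 = {"a.txt": b"A" * 12, "b.txt": b"B" * 11 + b"X"}

seen = set()
day1_new = new_blocks(to_stream(files_day1), seen)
day2_new = new_blocks(to_stream(files_day2), seen)

# The second block of the day-1 stream straddles both files.
straddling = split_blocks(to_stream(files_day1))[1]
```

Here day2_new holds only the raw bytes of the one changed block. Nothing in the block records which file it came from, and the straddling block mixes the tail of a.txt with the head of b.txt.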
The best trick I've found for tracking down what is changing (aside from
the obvious 'find . -mtime -1d') is to run tarsnap with a small value
for --maxbw-rate (e.g., --maxbw-rate 50000) and then send it a SIGINFO
(or SIGUSR1 if your OS doesn't have SIGINFO) every second. This will
prompt tarsnap to repeatedly print its current progress, and when it
slows down dramatically you've found a place where it is finding lots of
new data which it needs to upload.
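The signal trick above could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: the archive name and path are placeholders, the helper names tarsnap_command and poke_until_exit are invented for the example, and the command shown creates a real archive while it runs. It relies only on the -c, -f, and --maxbw-rate options and on the SIGINFO/SIGUSR1 behaviour described above.

```python
import signal
import subprocess
import time

# Prefer SIGINFO where the OS provides it, otherwise fall back to SIGUSR1.
PROGRESS_SIGNAL = getattr(signal, "SIGINFO", signal.SIGUSR1)


def tarsnap_command(archive_name, path, maxbw_rate=50000):
    """Build a throttled tarsnap create command (names are placeholders)."""
    return ["tarsnap", "-c", "-f", archive_name,
            "--maxbw-rate", str(maxbw_rate), path]


def poke_until_exit(proc, sig=PROGRESS_SIGNAL, interval=1.0):
    """Signal proc every `interval` seconds until it exits.

    Returns the number of signals sent. The progress lines themselves are
    printed by the child process, not by this function.
    """
    sent = 0
    while proc.poll() is None:
        time.sleep(interval)
        if proc.poll() is None:  # re-check so an exited child is not signalled
            proc.send_signal(sig)
            sent += 1
    return sent


# Example use (commented out because it would start a real backup):
# proc = subprocess.Popen(tarsnap_command("probe-archive", "/home"))
# poke_until_exit(proc)
```

Watching the output, the paths printed just before progress stalls are the ones worth inspecting.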
Colin Percival