
Re: duplicate files on different machines, both backed up to the same account



On Wed 05 Sep 2012 03:33:59 PM EDT, David Prager Branner wrote:
> I'm new to Tarsnap as of last night and am very pleased with it. I
> have this question:
>
> I have some duplicate files on different machines, both backed up to
> the same Tarsnap account. It appears to me that in this situation, the
> usual economizing of space and archiving time — as when identical
> content is placed in the different archives on the *same* machine — is
> not taking place. Is that correct? Or is there something I can do to
> effect a savings here?
>
> Many thanks!

As I understand it, deduplication takes place on the host, not the
server, so data backed up from two different hosts would not be
deduplicated against each other. Additionally, since all data is
encrypted in such a way that only the client is able to decrypt it, the
same raw data backed up by two different hosts would look different
on the server (provided that you use different keys for each host, as
you should from a security standpoint).
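
To illustrate the reasoning, here is a toy sketch in Python. It is not
Tarsnap's actual chunking or storage format; it only assumes that each
client identifies chunks by a hash keyed with its own secret. The names
chunk_id, new_chunks, and the two keys are made up for the example.

```python
import hashlib
import hmac


def chunk_id(key: bytes, chunk: bytes) -> str:
    """Identify a chunk by an HMAC keyed with the client's secret (assumed scheme)."""
    return hmac.new(key, chunk, hashlib.sha256).hexdigest()


def new_chunks(key: bytes, chunks: list, seen: set) -> list:
    """Return the chunks whose keyed identifier is not yet in `seen`.

    `seen` stands in for the set of identifiers already stored; it is
    updated in place as new chunks are found.
    """
    fresh = []
    for chunk in chunks:
        cid = chunk_id(key, chunk)
        if cid not in seen:
            seen.add(cid)
            fresh.append(chunk)
    return fresh


# Hypothetical per-host keys; real keys would be random secrets.
HOST_A_KEY = b"A" * 32
HOST_B_KEY = b"B" * 32

if __name__ == "__main__":
    stored = set()
    data = [b"same file contents", b"same file contents", b"other file"]
    print(len(new_chunks(HOST_A_KEY, data, stored)))  # host A, first backup
    print(len(new_chunks(HOST_A_KEY, data, stored)))  # host A, second backup
    print(len(new_chunks(HOST_B_KEY, data, stored)))  # host B, same data
```

Under this assumption, host A's second backup finds nothing new, while
host B's identifiers never match host A's, so the identical data is
stored again.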

If you wanted to deduplicate backups from multiple hosts against each
other, you would need to copy the data from each host to a single
location, then do one tarsnap backup of all the data from that
location. This shouldn't be hard to do using rsync (provided your
machines are in the same physical location or at least have a decent
amount of bandwidth between them).
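
A rough sketch of that workflow, written as a Python helper that only
builds and prints the commands rather than running them. The host
names, paths, staging directory, and archive name are placeholders I
made up; the only flags used are rsync's -a and tarsnap's -c and -f.

```python
import shlex


def build_commands(sources, staging_dir, archive_name):
    """Build rsync commands that pull each host's data into a staging
    directory, followed by a single tarsnap command archiving all of it.

    `sources` is a list of (host, remote_path) pairs. The staging
    directory itself is assumed to exist already.
    """
    commands = []
    for host, remote_path in sources:
        # The trailing slash on the source copies the directory's contents.
        commands.append(
            ["rsync", "-a", f"{host}:{remote_path}/", f"{staging_dir}/{host}/"]
        )
    # One archive covering every host's copy, so identical data is
    # deduplicated by the single client that runs the backup.
    commands.append(["tarsnap", "-c", "-f", archive_name, staging_dir])
    return commands


if __name__ == "__main__":
    # Placeholder hosts and paths, for illustration only.
    example = build_commands(
        [("hosta", "/home/user/docs"), ("hostb", "/home/user/docs")],
        "/srv/staging",
        "all-hosts-2012-09-05",
    )
    for cmd in example:
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them by hand before
running anything against a real account.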

Of course, this is all written based on my limited knowledge of the 
internal workings of tarsnap, so I could be wrong. If so, someone feel 
free to correct me :)