Re: Client->Server bandwidth < Server->Client bandwidth?
Hi Gabriel & list,
On 10/29/13 00:45, Gabriel Kerneis wrote:
> I have a question about daily bandwidth usage. On my admin interface, I see:
>
> Server->Client ≃ 3.7 × Client->Server
>
> I wonder if this is expected, considering that I am only creating (and deleting in
> a round-robin manner), never restoring archives. I am slightly surprised that
> the amount of control data would be so large, but this is just a naive question,
> I didn't study tarsnap source code.
I think it's the other way around: The amount of data being uploaded is so small.
When you create an archive, tarsnap deduplicates everything locally and only
uploads new blocks. Most of those blocks are (~64kB of) data; some are
metadata consisting of lists of (~1600) data blocks. If you have enough
unchanging data all together (~100 MB, i.e. roughly 1600 blocks of ~64kB
each), the metadata block listing those data blocks will be identical to a
previously stored one, so it won't be uploaded again either.
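As a rough check on where the ~100 MB figure comes from, here is a small arithmetic sketch. It treats the approximate sizes above as exact and takes 1 kB as 1000 bytes; both are assumptions made only for illustration, and the constant names are hypothetical rather than anything from the tarsnap source.

```python
# Approximate figures from the explanation above, treated as exact here.
DATA_BLOCK_BYTES = 64 * 1000          # ~64 kB per data block (assumed 1 kB = 1000 B)
BLOCKS_PER_METADATA_BLOCK = 1600      # ~1600 data blocks listed per metadata block


def bytes_covered_by_metadata_block(block_bytes: int, blocks_listed: int) -> int:
    """Amount of file data described by one metadata block."""
    return block_bytes * blocks_listed


covered = bytes_covered_by_metadata_block(DATA_BLOCK_BYTES, BLOCKS_PER_METADATA_BLOCK)
print(f"one metadata block covers about {covered / 1e6:.1f} MB")
```

So a run of unchanged data needs to be on the order of 100 MB before an entire metadata block comes out identical to one stored earlier.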
When you delete an archive, tarsnap needs to download all the metadata -- all
the lists of blocks -- so that it can adjust its reference counts locally and
figure out which blocks need to be deleted. As a result, the download bandwidth
used by deletes depends only on the size of the archive -- not how well it was
deduplicated when it was created.
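The asymmetry can be illustrated with a toy reference-counting model. This is only a sketch of the idea described above, not tarsnap's actual code or data structures: the class `ToyStore`, its methods, and the use of integers as block ids are all hypothetical, and it counts blocks rather than bytes.

```python
from collections import Counter


class ToyStore:
    """Toy model: dedup on create, full block-list read on delete."""

    def __init__(self):
        self.refcounts = Counter()  # local reference count per block id
        self.block_lists = {}       # archive name -> list of block ids (stands in for stored metadata)

    def create(self, name, block_ids):
        """Store an archive; return how many blocks had to be uploaded."""
        new_blocks = [b for b in set(block_ids) if self.refcounts[b] == 0]
        for b in block_ids:
            self.refcounts[b] += 1
        self.block_lists[name] = list(block_ids)
        return len(new_blocks)

    def delete(self, name):
        """Delete an archive; return (entries read from its block list, blocks freed)."""
        block_ids = self.block_lists.pop(name)  # the whole list must be fetched
        freed = 0
        for b in block_ids:
            self.refcounts[b] -= 1
            if self.refcounts[b] == 0:
                del self.refcounts[b]
                freed += 1
        return len(block_ids), freed


store = ToyStore()
blocks = list(range(1000))
first_upload = store.create("monday", blocks)             # everything is new
second_upload = store.create("tuesday", blocks + [1000])  # only one new block
listed, freed = store.delete("monday")                    # full list read, nothing freed
print(first_upload, second_upload, listed, freed)
```

In this model the second archive uploads a single block, while deleting the first archive still reads all 1000 list entries even though no block ends up being freed.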
If you use the --print-stats option when creating archives, I think you'll find
that the total size of the archive you're creating is much, much larger than the
amount of new data being uploaded.
> Could this indicate an issue with my local cache?
No, I can't see any possible connection there.
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid