Re: Database backup and deduplication question
On 12/23/11 11:49 PM, Colin Percival wrote:
> Can you try
>
> # tarsnap --dry-run -cvf testarchive file1 file1
> # tarsnap --dry-run -cvf testarchive part1 part1
> # tarsnap --dry-run -cvf testarchive part2 part2
> # tarsnap --dry-run -cvf testarchive part3 part3
>
> You should get a perfect 2:1 deduplication ratio (modulo overhead) storing the
> same file twice... but of course you should have gotten a 2:1 ratio when storing
> the file and its separate parts too, so I'd like to see if this works properly.
>
Hi Colin,
I ran those tests as shown, and each one worked as expected with a 2:1
ratio. Then I ran a few more sequences, in case they help:
sh-3.2# tarsnap --dry-run -cvf testarchive file1 file1 part1
a file1
a file1
a part1
                  Total size  Compressed size
All archives       236079049        237256169
  (unique data)    105008116        105526431
This archive       236079049        237256169
New data           105008116        105526431
sh-3.2# tarsnap --dry-run -cvf testarchive file1 file1 part1 part2
a file1
a file1
a part1
a part2
                  Total size  Compressed size
All archives       288540303        289974299
  (unique data)    129278071        129911860
This archive       288540303        289974299
New data           129278071        129911860
sh-3.2# tarsnap --dry-run -cvf testarchive file1 file1 part1 part2 part3
a file1
a file1
a part1
a part2
a part3
                  Total size  Compressed size
All archives       314771757        316338103
  (unique data)    138220066        138902042
This archive       314771757        316338103
New data           138220066        138902042
The first sequence (file1 file1 part1) looks OK, but in the next two the
"New data" number increases more than expected: part2 and part3 should
deduplicate against file1 and add almost nothing. I have another Mac, and
I'll run the same tests there to see if there's any machine-specific
issue here.
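
To put rough numbers on that, here is a small sketch of the arithmetic.
It only subtracts the "Total size" figures of consecutive runs above. It
assumes each difference can be attributed to the one part added in that
run, which seems reasonable for independent dry runs but is my
assumption, not something tarsnap reports. The names runs and deltas are
just for this illustration.

```python
# (label, "This archive" total size, "New data" total size),
# copied from the three dry runs above.
runs = [
    ("file1 file1 part1", 236079049, 105008116),
    ("file1 file1 part1 part2", 288540303, 129278071),
    ("file1 file1 part1 part2 part3", 314771757, 138220066),
]


def deltas(runs):
    """Compare each run with the previous one.

    Returns (added_part, archive_growth, new_data_growth) tuples.
    Assumption: the growth is caused by the last argument of the run.
    """
    out = []
    for (_, prev_total, prev_new), (label, total, new) in zip(runs, runs[1:]):
        added_part = label.split()[-1]
        out.append((added_part, total - prev_total, new - prev_new))
    return out


for part, grew, new in deltas(runs):
    # With perfect deduplication against file1, "new" would be near zero.
    print(f"{part}: archive grew by {grew} bytes, "
          f"new data grew by {new} bytes ({new / grew:.1%} not deduplicated)")
```

By that measure a large fraction of part2 and part3 was stored as new
data rather than being matched against file1.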
Thank you,
Greg