
Re: Expected deduplication doesn't take place



Hi Igor,
When you compress files, a single different byte in the input can change most of the compressed output. That's why deduplication is not working: the blocks of all the compressed files you have ever uploaded to tarsnap are different from each other.
If you have to upload your files compressed, try gzip --rsyncable. It makes compressed files work better with rsync, and also with tarsnap.
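To see the effect for yourself, here is a rough sketch. It is not tarsnap-specific, and the file names and sample data are invented for illustration. It builds two inputs that differ in a single byte, compresses both with plain gzip, and counts how many byte positions differ before and after compression:

```shell
#!/bin/sh
# Sketch: compare how far a one-byte change spreads before and after gzip.
# File names and the generated sample data are made up for illustration.
tmp=$(mktemp -d)

# Sample input: the numbers 1..200000, one per line.
awk 'BEGIN { for (i = 1; i <= 200000; i++) print i }' > "$tmp/a.txt"

# Copy of the input with only its first byte replaced (same length).
{ printf 'X'; tail -c +2 "$tmp/a.txt"; } > "$tmp/b.txt"

# Compress both; -n leaves the file name and timestamp out of the gzip header.
gzip -n -c "$tmp/a.txt" > "$tmp/a.gz"
gzip -n -c "$tmp/b.txt" > "$tmp/b.gz"

# Count the byte positions that differ (cmp exits non-zero when files differ).
count_diff() {
    { cmp -l "$1" "$2" 2>/dev/null || true; } | wc -l | tr -d ' '
}
plain_diff=$(count_diff "$tmp/a.txt" "$tmp/b.txt")
gz_diff=$(count_diff "$tmp/a.gz" "$tmp/b.gz")

echo "differing bytes, uncompressed: $plain_diff"
echo "differing bytes, gzip output:  $gz_diff"

rm -rf "$tmp"
```

On the uncompressed pair exactly one byte differs. On the compressed pair you should see more than that; the exact count depends on your gzip version and on the data.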
Hope it helps.
--
Mauro Ciancio

On Tue, Jan 19, 2016 at 6:12 AM, Igor Ostapenko <igor.ostapenko@gmail.com> wrote:
Hi,

Could you please give me a clue about what's wrong in the following situation?

I have a directory with static content. Subsequent tarsnap runs appear to
happen without deduplication, as if the content had changed (but it has not).
I've taken one of its files for testing:

$ # The first run
$ tarsnap -cvf .test.daily.20160119104958 .test
a .test
a .test/file.tar.xz
                                       Total size  Compressed size
All archives                               8.3 GB           3.5 GB
  (unique data)                            1.4 GB           622 MB
This archive                                10 MB            10 MB
New data                                    10 MB            10 MB

$ # The second run
$ tarsnap -cvf .test.daily.20160119105034 .test
a .test
a .test/file.tar.xz
                                       Total size  Compressed size
All archives                               8.3 GB           3.5 GB
  (unique data)                            1.4 GB           622 MB
This archive                                10 MB            10 MB
New data                                    10 MB            10 MB

$ ls -nl .test
total 10580
-rw-------  1 501  20  10830836 19 Jan 10:49 file.tar.xz

$ tarsnap --version
tarsnap 1.0.36.1

I have a bunch of *.gpg files in another archive where deduplication
works as expected.

I need to keep the files as tar.xz. Anyway, conceptually I expected
deduplication in this case.

What do you think?